ABSTRACT

PINOCCHIO (PINpointing Orbit-Crossing Collapsed Hierarchical Objects) is a new
algorithm for identifying dark matter halos in a given numerical realisation of the
linear density field in a hierarchical universe (Monaco et al. 2001). Mass elements are
assumed to have collapsed after undergoing orbit crossing, as computed using
perturbation theory. It is shown that Lagrangian perturbation theory, and in particular its
ellipsoidal truncation, is able to predict accurately the collapse, in the orbit-crossing
sense, of generic mass elements. Collapsed points are grouped into halos using an
algorithm that mimics the hierarchical growth of structure through accretion and
mergers. Some points that have undergone orbit crossing are assigned to the network
of filaments and sheets that connects the halos; it is demonstrated that this network
closely resembles that found in N-body simulations. The code generates a catalogue of
dark matter halos with known mass, position, velocity, merging history and angular
momentum. It is shown that the predictions of the code are very accurate when
compared with the results of large N-body simulations that cover a range of cosmological
models, box sizes and numerical resolutions. The mass function is recovered with an
accuracy of better than 10 per cent in number density for halos with at least 30-50
particles. A similar accuracy is reached in the estimate of the correlation length $r_0$.
The good agreement is still valid on the object-by-object level, with 70-100 per cent
of the objects with more than 50 particles in the simulations also identified by our
algorithm. For these objects the masses are recovered with an error of 20-40 per cent,
and positions and velocities with a root mean square error of ~1-2 Mpc (0.5-2 grid
lengths) and ~100 km/s, respectively. The recovery of the angular momentum of halos
is considerably noisier and accuracy at the statistical level is achieved only by
introducing free parameters. The algorithm requires negligible computer time as compared
with performing a numerical N-body simulation.
1 INTRODUCTION

In Dark Matter (DM) dominated cosmological models, 
structure grows through the gravitational amplification and 
collapse of small primordial perturbations, imprinted at very 
early times by some mechanism such as inflation. In particu- 
lar in the case of Cold Dark Matter (CDM), the formation of 
structure follows a hierarchical pattern, with more massive 
halos forming from accretion of mass and mergers of smaller 
objects (see e.g. Padmanabhan 1993 for a general introduc- 
tion). Galaxies form following the collapse of gas into these 
dark matter potential wells (see, e.g., White 1996 for a
review). An accurate description of the non-linear evolution
of perturbations in the DM field is thus important for
modeling the formation and evolution of astrophysical objects
within a cosmological setting.

The gravitational formation of dark matter halos is usually
addressed by means of N-body simulations. However,
a number of analytic or semi-analytic techniques based on 
Eulerian or Lagrangian perturbation theory (Bouchet 1997; 
Buchert 1997), for example the Press & Schechter (1974, PS) 
and similar techniques (see e.g. Monaco 1998 for a recent re- 
view), were devised to approximate some aspects of the grav- 
itational problem. Analytic techniques have the advantage 
of being both fast and flexible, thereby giving insight into 
the dynamics of the gravitational collapse. In particular,
Lagrangian Perturbation Theory (LPT; Moutarde et al. 1992;
Buchert & Ehlers 1993; Catelan 1995) and more specifically 
its linear term, the Zel'dovich (1970) approximation, were 
used to compute many properties of the density and veloc- 
ity fields in the 'mildly non-linear regime' when the density 
contrast is not very high, and particle trajectories still retain 
some memory of the initial conditions. The PS and extended 
PS approaches (Peacock & Heavens 1990; Bond et al. 1991; 
Lacey & Cole 1993) were used to generate merger histories 
of DM halos. Extensions of the PS approach to the non- 
linear regime were attempted by many authors (Cavaliere, 
Colafrancesco & Menci 1992; Monaco 1995, 1997a,b; Cava- 
liere, Menci & Tozzi 1996; Audit, Teyssier & Alimi 1997; Lee 
& Shandarin 1999; Sheth & Tormen 1999, 2001; Sheth, Mo 
& Tormen 2001). Alternative approaches assumed objects 
to form at the peaks of the linear density field (Peacock & 
Heavens 1985; Bardeen et al. 1986; Manrique & Salvador- 
Sole 1995; Bond & Myers 1996a,b; Hanami 1999), or ap- 
plied the Zel'dovich approximation to smoothed initial con- 
ditions (truncated Zel'dovich approximation, Coles, Melott 
& Shandarin 1993; Borgani, Coles & Moscardini 1994), or 
used the second-order LPT solution for the density field 
(Scoccimarro & Sheth 2001), or joined linear-theory predic- 
tions with Monte-Carlo methods such as the block-model 
(Cole & Kaiser 1988) and merging-cell model (Rodrigues & 
Thomas 1996; Nagashima & Gouda 1998; Lanzoni, Mamon 
& Guiderdoni 2000).

These approaches are limited to the linear or mildly 
non-linear regime and are generally unable to recover ac- 
curately the wealth of information available with a large 
numerical simulation. In particular, although PS provides a 
reasonable first approximation to the mass function of halos 
(Efstathiou et al. 1988; Lacey & Cole 1994), it underesti- 
mates the number of massive objects and overestimates the 
number of low mass ones (see, e.g., Gelb & Bertschinger 
1994; Governato et al. 1999; Jenkins et al. 2001; Bode et 
al. 2000). Similarly, the merger history of DM halos is rea- 
sonably well reproduced by the extended PS approach, but 
there are systematic differences when compared with sim- 
ulations, and also some theoretical inconsistencies (Lacey 
& Cole 1993; Somerville & Kolatt 1999; Sheth & Lemson 
1999). The clustering of halos of given mass in the PS ap- 
proach can be obtained analytically (Mo & White 1996; 
Catelan et al. 1998; Porciani, Catelan & Lacey 1999; Sheth 
& Tormen 1999; Sheth et al. 2001; Colberg et al. 2001), 
but the extended PS approach is not able to produce both 
spatial information and merger histories at the same time. 
This is true also for many of the non-linear extensions of 
the PS approach mentioned above. The merging cell model 
can provide spatial information on the halos (Lanzoni et 
al. 2000), but only in the space of initial conditions
(Lagrangian space), while the truncated Zel'dovich
approximation, though able to predict correlation functions in
Eulerian space, is not accurate in predicting the masses of
single objects (Borgani et al. 1994). Finally, the peak-patch
approach (Bond & Myers 1996a,b) can also generate
catalogues of halos with spatial information, but has never been
extended to predict the merging histories.

Semi-analytical models of galaxy formation assume that
the properties of a galaxy depend on the merger history of its
associated DM halo. So in order to make predictions of the
clustering properties of galaxies of a given type, one needs to
be able to compute the merger history and spatial clustering
simultaneously. Given the limitations of the analytic
techniques discussed above, such models have usually resorted to
analysing large N-body simulations with very many
snapshots to reconstruct the merger histories (see, e.g., Diaferio
et al. 1999). Alternatively, the extended PS approach is used
to compute the merger histories, while N-body simulations are
used to obtain the spatial information on the halos statistically
(e.g., Benson et al. 2000).

A new approach for obtaining the spatial information 
and the merger history simultaneously for many halos was 
recently proposed by Monaco et al. (2001, hereafter pa- 
per I; see also Monaco 1999 for preliminary results). In the 
PINOCCHIO (PINpointing Orbit-Crossing Collapsed Hier- 
archical Objects) formalism, LPT is used in the context of 
the extended PS approach, as in Monaco (1995; 1997a, b) 
and Monaco & Murante (1999), to provide predictions for 
the collapse of fluid elements in a given numerical realisa- 
tion of a linear density field. Mass elements are assumed to 
have collapsed after undergoing orbit crossing. Such points 
are then grouped into halos using an algorithm that mimics 
the hierarchical growth of structure through accretion and 
mergers. The Zel'dovich approximation is used to compute 
the Eulerian positions of halos at a given time. Some points 
that have undergone orbit crossing are assigned to the net- 
work of filaments and sheets that connects the halos. Paper I 
contained a preliminary comparison to simulations, demon- 
strating that PINOCCHIO can accurately reproduce many 
properties of the DM halos from a large N-body simulation
that started from the same initial density field. The good 
agreement is not only for statistical quantities such as the 
mass or the correlation function, but extends to the object- 
by-object comparison. PINOCCHIO thus provides a signifi- 
cant improvement over the extended PS approach, which is 
known to be approximately valid only in a statistical sense 
(Bond et al. 1991; White 1996). 

In this paper, the PINOCCHIO code is described in 
more detail, focusing on some aspects that were neglected 
in paper I, in particular the validity of orbit crossing as def- 
inition of collapse, the ability to disentangle halos from the 
filament web, a complete description of the free parameters 
involved in the model, its validity at galactic scales, and 
its extension to predicting the angular momentum of DM 
halos. An accompanying paper (Taffoni, Monaco & Theuns 
2001) focuses on the ability of PINOCCHIO to recover the 
merging histories of DM halos. The paper is organised as 
follows. Section 2 presents the first step of PINOCCHIO, 
the prediction of collapse time for generic mass elements. 
This prediction is directly compared with the results of two 
different N-body simulations. Section 3 presents the second 
step of PINOCCHIO, the fragmentation algorithm, with at- 
tention to the ability of separating filaments from relaxed 
halos. In section 4 PINOCCHIO is applied to the initial 
conditions of the two simulations mentioned above. The re- 
sults of PINOCCHIO are compared with the N-body ones 
in terms of statistical quantities (mass and correlation func- 
tions), on a particle-by-particle basis (mass fields) and on 
an object-by-object basis (mass, position and velocity). In 
Section 5 PINOCCHIO is extended to predict the angular 
momentum of the DM halos. Section 6 discusses the relation 
of PINOCCHIO to previous analytic and semi-analytic
approximations to the gravitational problem. Section 7 gives
the conclusions. 



2 PREDICTING THE COLLAPSE TIME FROM 
ORBIT CROSSING 

2.1 The definition of collapse 

Linear theory is unable to treat the later stages of
gravitational collapse, because the density grows at a constant
rate and hence never becomes very high. Therefore, it is
usually assumed that collapse takes place when the density
contrast $\delta = (\rho - \bar\rho)/\bar\rho$ reaches values
$\sim 1$ (here $\bar\rho(t)$ is the density of the background
cosmological model). In the special case of a spherical top-hat
perturbation, a singularity (a region of infinite density) forms
when the corresponding linear extrapolation of the density
contrast reaches a value $\delta_c \simeq 1.686$. It is usually
argued that the formation of the singularity corresponds to the
formation of the corresponding DM halo.

When more general cases than the spherical model are 
considered, the very definition of collapse becomes some- 
what arbitrary. Both in LPT and in the evolution of el- 
lipsoidal perturbations (White & Silk 1979; Monaco 1995, 
1997a; Bond & Myers 1996), collapse takes place along the 
three different directions defined by the eigenvectors of
the deformation tensor, at three different times (see below). 
Therefore, several definitions of collapse have been proposed, 
one related to first-axis collapse (Bertschinger & Jain 1994; 
Monaco 1995; Kerscher, Buchert & Futamase 2000), and 
another related to third-axis collapse (Bond & Myers 1996; 
Audit et al. 1997; Lee & Shandarin 1998; Sheth et al. 2001). 
The difference between these two definitions is discussed in 
detail in Section 5.1. 

In the Lagrangian picture of fluid dynamics the Eulerian 
(comoving) position x of a fluid element (or equivalently of a 
mass particle) is related to the initial (Lagrangian) position 
q through the relation: 

$$\mathbf{x}(\mathbf{q},t) = \mathbf{q} + \mathbf{S}(\mathbf{q},t), \qquad (1)$$

where S(q, t) is the displacement field. The Euler-Poisson 
system of equations (see Padmanabhan 1993) can be recast 
into an equivalent set of equations for S (see, e.g., Catelan 
1995). LPT is a perturbative solution to that system of equa- 
tions, whose first-order term is the well-known Zel'dovich 
approximation (Zel'dovich 1970; Buchert 1992): 

$$S_a(\mathbf{q},t) = -b(t)\,\varphi_{,a}(\mathbf{q}). \qquad (2)$$

Here and below, commas denote differentiation with respect
to $\mathbf{q}$, $b(t)$ is the linear growing mode, and
$\varphi(\mathbf{q})$ is the rescaled peculiar gravitational
potential, which obeys the Poisson equation:

$$\nabla^2 \varphi(\mathbf{q}) = \delta(\mathbf{q},t_i)/b(t_i) \equiv \delta_l(\mathbf{q}), \qquad (3)$$

where $t_i$ is an initial time at which linear theory holds. The
quantity $\delta_l(\mathbf{q})$ does not depend on the initial time.
It is called linear contrast, as it is equal to the linear
extrapolation of the density contrast to the time defined by
$b(t_0) = 1$, which can be taken to be the present time
(i.e., $b(z = 0) = 1$).
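
Equations (2) and (3) can be illustrated numerically. The following is a minimal sketch, not the PINOCCHIO implementation: it assumes a periodic cubic grid and uses NumPy FFTs to solve the Poisson equation and differentiate the potential. The function name `zeldovich_displacement` and its arguments are hypothetical.

```python
import numpy as np

def zeldovich_displacement(delta_l, box_size, b=1.0):
    """Return S_a = -b * d(phi)/dq_a on a periodic grid, where nabla^2 phi = delta_l.

    delta_l  : 3D array with the linear contrast on a cubic grid (assumed periodic)
    box_size : side of the box, in the same length units as the displacements
    b        : value of the linear growing mode
    """
    n = delta_l.shape[0]
    k1d = 2.0 * np.pi * np.fft.fftfreq(n, d=box_size / n)
    kx, ky, kz = np.meshgrid(k1d, k1d, k1d, indexing="ij")
    k2 = kx**2 + ky**2 + kz**2
    k2[0, 0, 0] = 1.0              # placeholder to avoid 0/0; the mean mode is zeroed below
    phi_k = -np.fft.fftn(delta_l) / k2
    phi_k[0, 0, 0] = 0.0           # the potential is defined up to a constant
    # Differentiation in Fourier space: d/dq_a -> multiplication by i k_a
    disp = [-b * np.real(np.fft.ifftn(1j * k * phi_k)) for k in (kx, ky, kz)]
    return np.stack(disp)
```

For a single plane wave, $\delta_l = A\cos(k_0 q_1)$, the sketch returns $S_1 = -bA\sin(k_0 q_1)/k_0$, whose divergence is $-b\,\delta_l$ as expected from equations (2) and (3).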

As the fluid element contains by construction a fixed
but vanishingly small mass, its density can be written as the
inverse of the Jacobian determinant of the transformation
given in equation (1):

$$1 + \delta(\mathbf{q},t) = \left[\det\left(x_{a,b}\right)\right]^{-1} = \left[\det\left(\delta_{ab} + S_{a,b}\right)\right]^{-1}. \qquad (4)$$

(Here $\delta_{ab}$ is the Kronecker symbol.) When the Jacobian
determinant vanishes, the density formally goes to infinity.
This corresponds to the formation of a caustic, a process
discussed in detail by Shandarin & Zel'dovich (1989). At
this time, the transformation $\mathbf{x} \to \mathbf{q}$ becomes
multi-valued, and particle trajectories undergo orbit crossing (OC).
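
As an illustration of the density formula above, consider the first-order (Zel'dovich) displacement. In the principal frame of the deformation tensor the Jacobian is diagonal, so its determinant is the product of the three factors $1 - b\lambda_i$, where $\lambda_i$ are the eigenvalues of $\varphi_{,ab}$. The sketch below is a hedged example under that first-order assumption only; the function name is hypothetical.

```python
import numpy as np

def zeldovich_density_contrast(eigenvalues, b):
    """Density contrast of a mass element under the Zel'dovich approximation.

    eigenvalues : the three eigenvalues lambda_i of the deformation tensor
    b           : value of the linear growing mode
    Returns infinity once the Jacobian determinant has reached zero (orbit crossing).
    """
    lam = np.asarray(eigenvalues, dtype=float)
    jacobian = np.prod(1.0 - b * lam)   # det(delta_ab + S_{a,b}) in the principal frame
    if jacobian <= 0.0:
        return np.inf                   # a caustic has formed; the single-stream formula no longer applies
    return 1.0 / jacobian - 1.0
```

For small $b$ the result tends to $b(\lambda_1 + \lambda_2 + \lambda_3)$, i.e. to the linear-theory contrast.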

Because the density becomes high at OC, we identify 
this moment as the collapse time (Monaco 1995; 1997a). In 
this way, collapse is well defined and easy to compute using 
LPT which remains valid up to that point but breaks down 
afterwards. We note that this definition corresponds to first- 
axis collapse as discussed at the beginning of this Section. 
This definition of collapse does not require the introduction 
of any free parameters. However, a drawback of this defini- 
tion is that it does not guarantee that the mass element is 
going to flow into a relaxed DM halo. Indeed, a fraction of 
particles that undergo OC remain in low density filaments 
instead of collapsed halos. 

The calculation of collapse times is presented in Monaco 
(1997a), to which we refer for all details. LPT converges in 
predicting the collapse time of a generic fluid element, as 
long as not more than 50 per cent of mass has collapsed. 
First-order LPT, i.e. the Zel'dovich approximation, is exact 
(up to OC) in the case of planar symmetry, but in the
spherical limit, relevant for the collapse of high peaks, it
overestimates the growing mode at collapse time by nearly a factor
of two (the value of $\delta_l$ for spherical collapse is 3, compared
with 1.686), while second-order LPT is ill-behaved in
underdensities. Thus third-order LPT must be used to calculate
the collapse time of generic mass elements. 
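
The factor quoted above for the Zel'dovich approximation can be checked with a few lines of arithmetic. At first order, orbit crossing along the first axis occurs when $1 - b\lambda_1 = 0$, with $\lambda_1$ the largest eigenvalue of the deformation tensor. The sketch below assumes only this first-order condition (it is not the ELL calculation of Monaco 1997a), and the function name is hypothetical.

```python
import numpy as np

def zeldovich_collapse_growth(eigenvalues):
    """Growing mode b_c at first-axis orbit crossing in the Zel'dovich approximation.

    Returns infinity if no eigenvalue is positive (the element never collapses).
    """
    lam_max = float(np.max(eigenvalues))
    return 1.0 / lam_max if lam_max > 0.0 else np.inf

# Spherical limit: the three eigenvalues are equal and sum to delta_l
delta_l = 1.686
b_sphere = zeldovich_collapse_growth([delta_l / 3.0] * 3)
# Planar limit: a single non-zero eigenvalue equal to delta_l
b_plane = zeldovich_collapse_growth([delta_l, 0.0, 0.0])
```

In the spherical case the product $\delta_l b_c$ equals 3 whatever the value of $\delta_l$, to be compared with 1.686 for the exact spherical solution; in the planar case it equals 1.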

The Lagrangian perturbative series can be truncated so
as to resemble formally the collapse of an ellipsoid in an
external shear field (see also Bond & Myers 1996a). When
the peculiar gravitational potential (equation 3) is expanded
into a Taylor series around a generic position (taken to be
the origin of the $\mathbf{q}$ frame), the first term relevant for the
deformation of the fluid element is the quadratic one,
$\varphi(\mathbf{q}) \simeq \varphi_{,ab}\, q_a q_b / 2$. In the principal
frame of $\varphi_{,ab}$ this can be written as:

$$\varphi(\mathbf{q}) = \frac{1}{2}\left(\lambda_1 q_1^2 + \lambda_2 q_2^2 + \lambda_3 q_3^2\right), \qquad (5)$$

where $\lambda_i$ are the three eigenvalues of $\varphi_{,ab}$. The initial
conditions for the ellipsoid semi-axes $a_i$ at the initial time $t_i$
are:

$$a_i = a(t_i)\left(1 - b(t_i)\lambda_i\right), \qquad (6)$$

where a(t) is the scale factor. Note that at the initial time the 
ellipsoid is an infinitesimally perturbed sphere. With these 
initial conditions, the exact equations of ellipsoidal collapse 
can be integrated numerically (Monaco 1995; Bond & Myers
1996a). However, it is easier to solve exactly the third-order
LPT equations in the ellipsoidal case of equation (5), as only
first and second derivatives of the peculiar potential are
retained. This LPT solution gives a very good approximation
to the numerical integration in all cases with the exception of
the spherical limit. A small numerical correction is sufficient
to recover this limit properly; this is described in Appendix
B of Monaco (1997a). Apart from describing accurately the
collapse of an ellipsoid, this solution gives a general approx- 
imation for the LPT evolution of a mass element under the 
action of gravity. This approximation, which will be denoted 
by ELL in the following, is easy to implement as it requires 
only the computation of the deformation tensor, while full 
third-order LPT requires the solution of many Poisson
equations, thereby introducing numerical noise. Moreover,
3rd-order LPT still underpredicts the quasi-spherical collapse
of the highest peaks (a simple correction, as in the ELL 
case, is not feasible in this case), and consequently also the 
high-mass tail of the mass function. In general, ELL is an 
adequate approximation to compute the OC collapse time 
of generic mass elements. 

In conclusion, it is worth stressing again that in this 
context ELL is purely a convenient truncation of LPT; no 
constraint is put on the shape of the collapsing objects, nor 
on the 'shape' of the mass elements (which is simply a mean- 
ingless concept). 



2.2 Testing OC as definition of collapse 

Before using OC as collapse prediction, it is necessary to 
decide whether LPT (and ELL in particular) is accurate 
enough to reproduce the OC-collapsed regions, and how 
these are related to the relaxed halos. This can be done 
by applying LPT to the initial conditions of a large N-body
simulation, and comparing the LPT OC regions to those 
computed by the simulation. 

For this and further comparisons we use two collisionless
simulations. The first, a standard CDM model (SCDM),
has been performed with the PKDGRAV code, and consists
of $360^3$ ($\sim 46 \times 10^6$) DM particles (Governato et al. 1999);
it was also used in paper I. The second simulation has
been performed with the Hydra code (Couchman, Thomas
& Pearce 1995), and consists of $256^3$ DM particles in a flat
Universe with cosmological constant (ΛCDM). In order to
test for resolution effects, the same simulation has been run
with $128^3$ particles, resampling the initial displacements on
the coarser grid (we will refer to it as ΛCDM128). The main
characteristics of the simulations are summarised in Table 1.
These simulations allow us to test PINOCCHIO for differ- 
ent cosmologies, different resolutions, and different N-body 
codes, reaching a range of at least 5 orders of magnitude in 
mass with good statistics in terms of both numbers of halos 
and numbers of particles per halo. The PKDGRAV simu- 
lation samples a very large volume, making it suitable for 
testing the high mass tail of the mass function. The Hydra 
simulation samples a much smaller volume but at higher 
resolution, so we can test the power-law part of the mass 
function at small masses. Note that in all the simulations 
the particles are initially placed on a regular cubic grid. We 
have compared our results with another ΛCDM simulation
performed with PKDGRAV, with the same box (in Mpc/h) 
and number of particles as the SCDM one. The compari- 
son confirms all the results given in this paper, and is not 
presented here. 

The predictions of collapse are performed as follows.
The linear contrast $\delta_l$ is obtained from the initial
displacements of the simulation using the relation (see equations 2
and 3):

$$S_{a,a}(\mathbf{q},t_i) = -\delta_l(\mathbf{q})\, b(t_i). \qquad (7)$$

For the SCDM simulation the displacements are first
resampled on a $256^3$ grid for computational ease. In this case, as
well as throughout the paper, differentiations are performed
with Fast Fourier Transforms (FFTs). This procedure
allows one to recover the linear contrast with minimum noise
and no bias. The linear contrast $\delta_l$ is then FFT-transformed
and smoothed on many scales $R$ with a Gaussian window
function in Fourier space:

$$W(kR) = \exp\left(-k^2 R^2 / 2\right). \qquad (8)$$
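
The smoothing of equation (8) amounts to a multiplication in Fourier space. The following is a minimal sketch assuming a periodic cubic grid and NumPy FFTs; the function name `gaussian_smooth` and its arguments are hypothetical and do not refer to the actual code.

```python
import numpy as np

def gaussian_smooth(field, box_size, radius):
    """Smooth a periodic 3D field with the window W(kR) = exp(-k^2 R^2 / 2)."""
    n = field.shape[0]
    k1d = 2.0 * np.pi * np.fft.fftfreq(n, d=box_size / n)
    kx, ky, kz = np.meshgrid(k1d, k1d, k1d, indexing="ij")
    k2 = kx**2 + ky**2 + kz**2
    window = np.exp(-0.5 * k2 * radius**2)   # equals 1 everywhere when radius = 0
    return np.real(np.fft.ifftn(np.fft.fftn(field) * window))
```

With `radius = 0` the field is returned unchanged, so the unsmoothed case needs no special treatment; the mean of the field is preserved for any radius because $W(0) = 1$.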

The smoothing radii are equally spaced in $\log R$, except for
the smallest smoothing radius, which is set to zero in order
to recover all the variance at the grid scale. The largest
smoothing radius $R_{\max}$ is set such that the rms of the linear
density contrast is $\sigma(R_{\max}) = 1.686/6$, making the collapse of
a halo at this smoothing scale approximately a $6\sigma$ event.
The smallest non-zero smoothing radius is set to a third
of $R_{\max}$. Because of the stability of Gaussian smoothing, 25
smoothing radii in addition to $R = 0$ give adequate sampling
for a $256^3$ realisation (we use 15+1 smoothing radii for $128^3$
grids). For each smoothing radius $R$ the deformation tensor,
$\varphi_{,ab}(\mathbf{q}; R)$, is obtained in Fourier space from the
FFT-transformed, smoothed linear density contrast $\delta_l(\mathbf{k}; R)$ as
$\varphi_{,ab}(\mathbf{k}; R) = -k_a k_b / k^2\, \delta_l(\mathbf{k}; R)$, and then
transformed back to real space, again with FFT. Double precision is
required in this calculation to obtain sufficiently accurate results.
The ELL collapse times are computed for each grid point from
the value of the deformation tensor as described in Section
2.1 and Appendix B of Monaco (1997a).
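
The computation of the deformation tensor and of its eigenvalues at one smoothing radius can be sketched as follows. This is an illustration under stated assumptions, not the code itself: the grid is taken to be periodic, the function name is hypothetical, and the sign of the Fourier-space expression is chosen here so that the trace of the tensor reproduces $\delta_l$, as the Poisson equation (3) requires. NumPy arrays are double precision by default.

```python
import numpy as np

def deformation_eigenvalues(delta_l, box_size, radius):
    """Eigenvalues of the smoothed deformation tensor phi_{,ab} at each grid point.

    Returns an array of shape delta_l.shape + (3,), sorted in ascending order.
    Assumption of this sketch: the sign convention makes the trace equal to the
    smoothed linear contrast (minus its mean over the box).
    """
    n = delta_l.shape[0]
    k1d = 2.0 * np.pi * np.fft.fftfreq(n, d=box_size / n)
    k = np.meshgrid(k1d, k1d, k1d, indexing="ij")
    k2 = k[0]**2 + k[1]**2 + k[2]**2
    k2[0, 0, 0] = 1.0                                  # placeholder to avoid 0/0
    delta_k = np.fft.fftn(delta_l) * np.exp(-0.5 * k2 * radius**2)
    delta_k[0, 0, 0] = 0.0                             # remove the mean mode
    tensor = np.empty(delta_l.shape + (3, 3))
    for a in range(3):
        for b in range(a, 3):
            comp = np.real(np.fft.ifftn(k[a] * k[b] / k2 * delta_k))
            tensor[..., a, b] = comp
            tensor[..., b, a] = comp                   # the tensor is symmetric
    return np.linalg.eigvalsh(tensor)
```

A useful consistency check is that the three eigenvalues at each grid point sum to the smoothed linear contrast.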

It is convenient to use the growing mode $b(t)$ as time
variable, because with this choice the dynamics of
gravitational collapse is (almost) independent of the background
cosmology (see, e.g., Monaco 1998). For this reason, in place
of the collapse time $t_c$ we record the growing mode at
collapse, $b_c = b(t_c)$. With the procedure outlined above, a
collapse time is computed for each grid vertex $\mathbf{q}$ and for each
smoothing radius $R$, i.e. $b_c = b_c(\mathbf{q}; R)$. We define the inverse
collapse time field $F$ as:

$$F(\mathbf{q}; R) = 1/b_c(\mathbf{q}; R). \qquad (9)$$

In the case of linear theory $F = \delta_l/\delta_c$. The values of the
$F$-field at a single point $\mathbf{q}$ correspond to the trajectories in the
$F$-$R$ plane (or equivalently the $F$-$\sigma^2(R)$ plane) used in
the excursion set approach to compute the mass function. In
fact, as shown by Monaco (1997b), this quantity is obtained
from the absorption rate of the $F(R)$ trajectories by a
barrier put at a level $F_c$. As the smoothing filter is Gaussian,
these trajectories are not random walks but are strongly
correlated. In general, the computation of the absorption rate
requires no free parameter as long as the collapse condition
does not. To solve the cloud-in-cloud problem, we record for
each grid point the largest radius $R_c$ at which the inverse
collapse time overtakes $F_c$; the grid point is assumed to be
collapsed at all smaller scales. We call this radius $R_c(\mathbf{q})$, the
collapse radius field (Monaco & Murante 1999). $R_c$ depends
on the height of the barrier as well as on time.
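
The construction of the collapse radius field from the $F$ values can be sketched as below. The array layout (smoothing radius along the first axis, radii in increasing order) and the use of NaN to flag points that never cross the barrier are assumptions of this illustration, as is the function name.

```python
import numpy as np

def collapse_radius(F, radii, F_c):
    """Largest smoothing radius at which F(q; R) reaches the barrier F_c.

    F     : array of shape (n_radii, ...) with the inverse collapse times
    radii : the n_radii smoothing radii, sorted in increasing order
    F_c   : height of the barrier
    Points that never reach the barrier are flagged with NaN (assumption of this sketch).
    """
    radii = np.asarray(radii, dtype=float)
    above = np.asarray(F) >= F_c
    # argmax on the reversed radius axis finds the first True starting from the largest radius
    idx_from_top = np.argmax(above[::-1], axis=0)
    largest = len(radii) - 1 - idx_from_top
    return np.where(above.any(axis=0), radii[largest], np.nan)
```

Taking the largest radius, rather than the first upcrossing met when going from small to large scales, is what implements the cloud-in-cloud prescription: a point that crosses the barrier at some scale is considered collapsed at all smaller scales.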

Name      Particles  Box (Mpc/h)  Omega_0  Omega_Lambda  h     Gamma  sigma_8  Particle mass (M_sun)
SCDM      360^3      500          1.0      0.0           0.5   0.5    1.0      1.49 x 10^12
ΛCDM      256^3      100          0.3      0.7           0.65  0.195  0.9      7.64 x 10^9
ΛCDM128   128^3      100          0.3      0.7           0.65  0.195  0.9      6.11 x 10^10

Table 1. Simulations used for the analysis.

The $R_c$ field for the simulations is obtained as follows.
The displacement field $\mathbf{S}^{\rm sim}$ (i.e. the displacement of
N-body particles from their initial position on the grid) is
smoothed in the Lagrangian space $\mathbf{q}$ with the same set of
smoothing radii. (Also here, we resample the large $360^3$
simulation to a $256^3$ grid using nearest grid point
interpolation.) Each smoothed field is differentiated using FFTs along
the three spatial directions and the Jacobian determinant
$\det(\delta_{ab} + S^{\rm sim}_{a,b})$ is computed for each grid vertex. For each
grid point, we again record the largest smoothing radius
$R_c^{\rm sim}(\mathbf{q})$ at which the Jacobian determinant first becomes
negative (hence passing through 0).
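
The Jacobian determinant of a gridded displacement field can be evaluated with FFT differentiation, as in the following sketch. It assumes a periodic cubic grid and a displacement array already smoothed at the radius of interest; the function name and array layout are hypothetical.

```python
import numpy as np

def jacobian_determinant(disp, box_size):
    """Return det(delta_ab + S_{a,b}) at each grid point.

    disp     : array of shape (3, n, n, n) with the displacement components
    box_size : side of the periodic box, in the same units as the displacements
    """
    n = disp.shape[1]
    k1d = 2.0 * np.pi * np.fft.fftfreq(n, d=box_size / n)
    k = np.meshgrid(k1d, k1d, k1d, indexing="ij")
    jac = np.zeros(disp.shape[1:] + (3, 3))
    for a in range(3):
        s_k = np.fft.fftn(disp[a])
        for b in range(3):
            # derivative of component a along direction b
            jac[..., a, b] = np.real(np.fft.ifftn(1j * k[b] * s_k))
    jac += np.eye(3)                   # add the Kronecker symbol
    return np.linalg.det(jac)
```

Grid points where the returned value is negative at a given smoothing radius are those counted as orbit-crossed at that radius.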

The $R_c$ fields computed using LPT and obtained from
the simulations at redshift $z = 0$ are compared in figure 1.
The two fields are remarkably similar, exhibiting the same
structure of broad peaks, with the difference that the peaks
of the simulation are lower, as anticipated by Monaco (1999).
In figure 2 we show a more quantitative point-by-point
comparison between the two fields. For display purposes, some
random noise has been added to the discrete values of $R_c$;
in this way the values lie in squares instead of points. There
is a reasonably tight correlation between the predicted and
numerical collapse-radius fields, which confirms the power of
LPT to predict the mildly non-linear evolution of
perturbations; it is noteworthy that this comparison does not involve
free parameters. The correlation is quantified by the
well-known Spearman rank correlation coefficient $r_S$ and
Pearson's linear correlation coefficient $r_P$, both reported in the
panels of figure 2. (A high value of $r_S$ indicates the
existence of a relation with moderate scatter, a high value of $r_P$
indicates the existence of a good linear relation.) The
coefficients take rather high values of $\sim 0.8$, confirming the
correlation in an objective and quantitative way.
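
For reference, the two correlation coefficients can be computed as in the following sketch, which uses NumPy only. Because $R_c$ takes discrete values, ties are frequent, so the rank computation assigns the average rank to tied values; the function names are hypothetical.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson's linear correlation coefficient."""
    return float(np.corrcoef(x, y)[0, 1])

def rank_with_ties(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    values = np.asarray(values).ravel()
    _, inverse, counts = np.unique(values, return_inverse=True, return_counts=True)
    last = np.cumsum(counts)                 # rank of the last member of each tie group
    mean_rank = last - (counts - 1) / 2.0    # average rank within each tie group
    return mean_rank[inverse]

def spearman_r(x, y):
    """Spearman rank correlation: Pearson's coefficient computed on the ranks."""
    return pearson_r(rank_with_ties(x), rank_with_ties(y))
```

A monotonic but non-linear relation gives $r_S = 1$ with $r_P < 1$, which is why the two coefficients carry complementary information.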

However, as noted also in figure 1, the relation between
the two $R_c$ fields is not unbiased: the simulated $R_c$ field is
lower than the ELL one, especially at large $R$-values. The
cause of this behaviour can be understood as follows. LPT 
predicts that after OC particles do not remain bound to the 
caustic region but move away from it, in contrast to what 
happens in the simulations. Therefore, as in this analysis 
particles are not explicitly restricted to the pre-OC (sin- 
gle stream) regime, the displacements in the simulation are 
always smaller than those predicted by LPT. As a conse- 
quence, the collapse radius obtained by the smoothed dis- 
placements of the simulation is lower than that predicted by 
LPT. This bias disappears at small radii, which are however 
dominated by numerical noise. 

The difference between the LPT and simulation fields
$R_c$ can also be quantified by the cumulative distribution
of the $R_c$ fields as a function of $R$, or equivalently of the
variance $\sigma^2(R)$. We will denote this function by $\Omega(<\sigma^2)$,
since it is also the fraction of mass collapsed on a scale $> R$
where the rms is smaller than $\sigma$. This quantity is used in
the PS approach to obtain the mass function.
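
The cumulative distribution of a collapse radius field can be sketched as follows. The example returns the collapsed fraction as a function of the smoothing radius; the mapping from $R$ to $\sigma^2(R)$ is not included. The NaN flag for uncollapsed points and the function name are assumptions of this illustration.

```python
import numpy as np

def collapsed_fraction(r_c, radii):
    """Fraction of grid points whose collapse radius is at least R, for each R in radii.

    r_c   : collapse radius field, with NaN marking points that never collapse
            (an assumption of this sketch)
    radii : smoothing radii at which the cumulative fraction is evaluated
    """
    r_c = np.asarray(r_c, dtype=float).ravel()
    collapsed = r_c[~np.isnan(r_c)]
    return np.array([(collapsed >= r).sum() / r_c.size for r in radii])
```

Since all grid points carry the same mass in Lagrangian space, the fraction of points equals the fraction of mass.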
The functions $\Omega(<\sigma^2)$ from ELL and the simulations
are compared in figure 3. The LPT curves are by
construction independent of time and cosmology, so that only the
$z = 0$ LPT prediction is shown. In contrast, the $\Omega$ curves
obtained from the simulations change with time. At late
times, particles have crossed the structure they belong to
many times and the numerical displacements differ more and
more from the LPT ones. This is confirmed by the fact that
the point of intersection between the $\Omega(<\sigma^2)$ obtained from
LPT and simulation roughly scales as $b(t)^2$. Most notably,
the difference between predictions and simulations tends to
vanish for the highest redshifts; in this case the particles
have not had time to cross the structures, and their
trajectories are very similar to the LPT ones. In all cases we
notice that the numerical $\Omega(<\sigma^2)$ functions become larger
than the LPT ones at the smallest, unsmoothed scales,
especially in the SCDM case and at higher redshift. This is most
likely due to numerical noise present in the simulation, which
enhances the level of non-linearity of the displacements, and
in the SCDM case to the resampling from $360^3$ to $256^3$ grids.

For comparison, we show in figure 3 also $\Omega(<\sigma^2)$ from
linear theory with $\delta_c = 1.686$, which falls short of both the
ELL prediction and the simulations. We have verified that
linear theory (with Gaussian smoothing!) misses the collapse
of many mass points that belong to filaments or to low mass
halos. Decreasing $\delta_c$ to 1.5 improves the agreement only at
the largest masses, but does not solve the problem at small
masses. The Zel'dovich approximation severely
underpredicts $\Omega(<\sigma^2)$ at large masses, but approaches the ELL curve
for lower mass (Monaco 1997a). Consequently, using either
linear theory or the Zel'dovich approximation instead of
ellipsoidal collapse would significantly decrease the accuracy
of PINOCCHIO.

We have also computed the R c field using full 3rd-order 
LPT. With respect to ELL, the fraction of collapsed points 
increases at small scales $R$, but decreases at large radii,
because of the already mentioned inability of 3rd-order LPT to
reproduce the spherical limit correctly (Monaco 1997a). We 
have verified that the correlation with the numerical R c field 
is noisier, and that the additional small-scale contribution 
of collapsed matter consists mainly of particles in filaments. 
Moreover, the computation is much more demanding than 
the ELL case. We conclude that there is no advantage in 
using the full 3rd-order LPT solution. 



2.3 $R_c$ and the simulated halos

Figure 1. Upper panels: collapse radius fields $R_c$ for a section of the Lagrangian space of the ΛCDM simulation at redshift $z = 0$. In
the left panel we show the ELL prediction, and in the right panel the results from the simulation. Lower left panel: mass field for the
same section; the mass field gives for each particle the mass of the halo it belongs to at $z = 0$. Ungrouped particles are assigned zero mass.
Lower right panel: inverse collapse time $F_{\max}$ for the same section.

Figure 2. Comparison of the collapse radius fields $R_c$, as predicted by ELL, with the values found in the simulations, for a random
sample of ~20000 points for the SCDM model (left panel), and the ΛCDM model (right panel). For clarity some random noise has been
added to the discrete $R_c$ values, so that they lie on squares instead of points.

Having demonstrated the ability of LPT in predicting
collapse in the OC sense (without free parameters), we need
to decide whether OC may be of any use to predict which
mass elements are going to end up in relaxed halos. In order
to do so, we compute the 'mass field' from the simulation,
which assigns to each grid vertex in the initial conditions
the mass of the halo that the corresponding particle ends
up in. Halos have been identified in the simulation using a
standard friends-of-friends (FOF) algorithm, with a linking
length 0.2 times the mean interparticle distance.* The mass
field is shown in figure 1 for the same slice of the ΛCDM
simulation as the other panels. A FOF halo looks like a
plateau, with the plateau's height giving the halo's mass.
There is a broad agreement between the peaks in the $R_c$ and
mass fields, because massive (low mass) objects are generally
associated with large (small) smoothing radii. Consequently,
there certainly is some connection between orbit-crossed
regions and relaxed halos. However, there are some important
differences as well.

* The simulation halos were identified using a standard FOF
algorithm with linking length 0.2, irrespective of cosmology. In this
way, halos are defined above a fixed fraction of the mean
density - as opposed to above a fixed fraction of the critical density.
Jenkins et al. (2001) showed that this makes the mass function
almost universal with cosmology, and in addition it is similar to
the definition used in PINOCCHIO.

Not all the FOF points fall within the boundaries of the $R_c$ contours.
This fact was already addressed by Monaco & Murante (1999), and is
expected because the OC criterion tends to miss those infalling particles
that have not made their first crossing of the structure. In fact,
strictly speaking those should not be counted as belonging to the relaxed
halo anyway. For SCDM, the fraction of FOF particles not predicted to be
OC-collapsed ranges from ~10 per cent at large masses to ~20 per cent at
smaller masses; smaller values are obtained for ΛCDM, where the fraction
of collapsed mass is higher. This has a modest impact on the results, and
is hardly noticeable in figure 1.

More importantly, the reverse is true as well: many particles assigned
non-vanishing or even high $R_c$ values do not belong to a halo. These
particles are in the moderately overdense filaments and sheets that
connect the relaxed halos. These structures, although indeed in the
multi-stream regime, are in a relaxation state very different from that of
the halos. It is apparent that the removal of such sheets and filaments
(hereafter referred to as filaments) is an important issue that needs to
be addressed.



2.4 Computing the collapse time 

Another feature apparent when comparing the mass and $R_c$ fields
(figure 1) is that many FOF halos may correspond to a single broad peak of
$R_c$. This makes the time-dependent $R_c$ field unsuitable for addressing
the fragmentation of matter into halos and filaments. It is more
convenient to follow a procedure similar to the merging cell model
(Rodrigues & Thomas 1996; Lanzoni et al. 2000), i.e. recording for each
mass point the largest F-value it reaches, or, in other words, the highest
redshift at which the point is predicted to collapse in the OC sense (for
SCDM it is simply $F = (1 + z_c)$, where $z_c$ is the collapse redshift).
This is another way to solve the so-called cloud-in-cloud problem (Bond et
al. 1991): a point that collapses at some redshift is assumed to be
collapsed at all lower redshifts. We therefore record the following
quantity:

$F_{\max}(\mathbf{q}) = \max_R [F(\mathbf{q}; R)]$. (11)

Together with $F_{\max}$ we also store for each point the smoothing radius
$R_{\max}$ at which $F = F_{\max}$, and the corresponding Zel'dovich
velocity $v_{\max}$ computed at the time $b(t) = 1/F_{\max}$ appropriate
for the smoothing radius $R_{\max}$.
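
The bookkeeping of equation 11 can be sketched as follows. This is only an
illustration, not the PINOCCHIO code: it assumes that the fields F(q; R)
have already been computed for every smoothing radius and stacked in an
array, and the names `fmax_over_radii`, `f_fields` and `radii` are
hypothetical. The storage of $v_{\max}$ is omitted.

```python
import numpy as np

def fmax_over_radii(f_fields, radii):
    """Return F_max and the radius R_max at which it is reached.

    f_fields : array of shape (n_radii, ...), F(q; R) for each radius
    radii    : array of shape (n_radii,), the smoothing radii R
    """
    f_fields = np.asarray(f_fields, dtype=float)
    radii = np.asarray(radii, dtype=float)
    f_max = f_fields.max(axis=0)        # largest F reached by each point
    idx = f_fields.argmax(axis=0)       # index of the radius where it happens
    return f_max, radii[idx]

# Toy input: three smoothing radii on a 2x2 grid (values are made up).
radii = np.array([4.0, 2.0, 0.0])
f_fields = np.array([
    [[0.5, 0.2], [0.1, 0.9]],
    [[0.7, 0.1], [0.3, 0.8]],
    [[0.6, 0.4], [0.2, 0.5]],
])
f_max, r_max = fmax_over_radii(f_fields, radii)
```

In this toy example the point at (1, 1) reaches its largest F at the
largest radius, while the point at (0, 1) does so only without smoothing.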

In contrast to $R_c$, the inverse collapse time $F_{\max}$ evidently does
not depend on time, while it does depend on the smoothing radius. The
excursion set of those points where $F_{\max}$ is greater than some level
$F_c$ gives the mass that has collapsed before the time $t_c$ that
corresponds to $F_c$, at the highest resolution on the grid (i.e. without
smoothing, $R = 0$). The lower right panel of figure 1 plots the
$F_{\max}$ field for the same section as the other panels. Within each
large object identified in the mass field, $F_{\max}$ has many small peaks
that correspond to objects forming at higher redshifts. These peaks are
modulated by modes on a larger scale that follow the excursions of the
$R_c$ field. Those large-scale modulations are ultimately responsible for
the later merging of these small peaks into the massive object identified
at late times. In this way, PINOCCHIO combines the information on the
progenitors to reconstruct the merger history of objects, as described in
detail in the next section.
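
As a small illustration of the excursion set statement above, the
collapsed mass fraction at a given redshift can be read directly off the
$F_{\max}$ field. The sketch assumes equal-mass grid points and uses the
SCDM relation $F_c = 1 + z_c$ quoted earlier; the function name and the
numbers are made up.

```python
import numpy as np

def collapsed_fraction(f_max, f_c):
    """Fraction of (equal-mass) grid points with F_max above the level F_c."""
    f_max = np.asarray(f_max, dtype=float)
    return float(np.mean(f_max > f_c))

# Made-up F_max values for six grid points.
f_max_values = np.array([0.3, 1.2, 2.5, 0.9, 4.0, 1.0])
frac_z0 = collapsed_fraction(f_max_values, 1.0 + 0.0)   # SCDM, z_c = 0
frac_z1 = collapsed_fraction(f_max_values, 1.0 + 1.0)   # SCDM, z_c = 1
```

The fraction can only grow towards lower redshift, which is the
cloud-in-cloud property built into $F_{\max}$.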



3 IDENTIFICATION AND MERGER HISTORY 
OF HALOS 

In the PS and excursion set approaches the mass of the objects that form
at a scale $R$ is simply estimated as

$M \simeq \frac{4\pi}{3} \bar{\rho} R^3$. (12)
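
For orientation, equation 12 can be evaluated numerically. The sketch
below assumes the standard value of the critical density,
2.775 x 10^11 h^2 M_sun Mpc^-3, and a mean matter density
$\bar{\rho} = \Omega_m \rho_{\rm crit}$; the function name `ps_mass` and
the choice $\Omega_m = 1$ for the example are assumptions made here.

```python
import math

RHO_CRIT = 2.775e11  # critical density in h^2 M_sun / Mpc^3

def ps_mass(radius, omega_m=1.0):
    """M = (4 pi / 3) * rho_bar * R^3, with R in Mpc/h and M in M_sun/h."""
    rho_bar = omega_m * RHO_CRIT
    return 4.0 * math.pi / 3.0 * rho_bar * radius ** 3

m_1mpc = ps_mass(1.0)      # about 1.16e12 M_sun/h for Omega_m = 1
m_2mpc = ps_mass(2.0)      # eight times larger, since M scales as R^3
```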

A more detailed treatment of the complex processes that determine the
shape of the Lagrangian region to collapse into a single halo is required
to get an improved description of the formation of the objects, and thus
an improved agreement with simulations at the object-by-object level. In
PINOCCHIO, this is done by generating realisations of the density field on
a regular grid, computing the $F_{\max}$ field as explained above, and
then 'fragmenting' the collapsed medium into halos and filaments by
considering the fate of each particle separately. To enable a detailed
comparison with the simulation, we will perform these steps on the initial
conditions of the runs. Therefore, we can compare the properties of
individual halos between the simulations and PINOCCHIO, not just the
statistics of halos. Of course, PINOCCHIO can be applied to any
realisation of a density field, including non-cubic grids and non-Gaussian
perturbation fields. The fragmentation algorithm can even be applied to
non-regular and non-periodic grids, if the FFT-based calculation of
$F_{\max}$ is suitably modified.

The fragmentation code mimics the two main processes of hierarchical
clustering, that is the accretion of mass onto halos and the merging of
halos. The particles of the realisation are considered in order of
descending $F_{\max}$ value, i.e. in chronological order of collapse. At a
given time the particles that have already collapsed will be either
assigned to a specific halo, or associated with filaments. Because of the
continuity of the transformation between Lagrangian and Eulerian
coordinates, equation 1, a particle must touch a halo in the Lagrangian
space if it is to accrete onto it.† Thus a collapsing particle can accrete
only onto those halos that are 'touched' by it, i.e. that already contain
one of its 6 nearest neighbours in the Lagrangian space of initial
conditions (we call these particles Lagrangian neighbours). To decide
whether the particle does accrete onto a touching halo, we displace it to
the Eulerian space according to its $v_{\max}$ velocity. The halo is
displaced to its Eulerian position at the time of accretion, using the
average velocity of all its constituent particles.‡ In the following we
express sizes and distances in terms of the grid spacing. The size $R_N$
of a halo of $N$ particles is taken to be

$R_N = N^{1/3}$. (13)

The collapsing particle is assumed to accrete onto the halo if the
Eulerian (comoving) distance $d$ between particle and halo is smaller than
a fraction of the halo's size $R_N$:

$d < f_a \times R_N$. (14)

The free parameter $f_a$, which is smaller than one, controls the
overdensity that the halo reaches in the Eulerian space,
$1 + \delta_{\rm halo} \simeq 3/(4\pi f_a^3)$. Therefore, this criterion
selects halos at a given overdensity, making it similar to the usual FOF
or similar selection criteria. The value of the $f_a$ parameter is fixed
in Appendix A to ~0.25. The halos then reach a much lower overdensity than
the value ~200 used in simulations; Zel'dovich velocities (and LPT
velocities in general) are not accurate enough to reproduce such high
densities for the relaxed halos. However, PINOCCHIO only attempts to
identify the halos, not to compute their internal density profile as well.

† Here it is assumed that a particle that accretes onto a halo never
escapes back into the field. Such stripping does occasionally happen in
simulations, but not very often and we neglect it.

‡ Thus the velocity of a halo is an average over velocities calculated at
different smoothing radii. A better estimate (but expensive in terms of
computer memory) would be to average the unsmoothed velocities over the
particles of the halo. Fortunately, the stability of the velocity to
smoothing makes the two estimates very similar, once the average is
performed over many particles.

Figure 4. Final positions of particles at z = 0 from a slice of the
initial conditions for the SCDM model (plot a) and the ΛCDM model (plot
b). In each plot, the left panels show those particles that are in
filaments (i.e. that have undergone OC but are not assigned to a halo),
the right panels show particles that are assigned to halos. Upper panels
are obtained from the simulations, lower panels refer to PINOCCHIO. The
large visual difference between the two cosmologies is mostly due to the
very different box size used.
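
The accretion test of equations 13 and 14, and the overdensity implied by
a given $f_a$, can be written in a few lines. This is a minimal sketch
with hypothetical function names; distances are in grid units and the
default $f_a = 0.25$ is the approximate value quoted above.

```python
import math

def halo_size(n_particles):
    """Lagrangian size R_N = N^(1/3) in grid units (equation 13)."""
    return n_particles ** (1.0 / 3.0)

def accretes(d, n_particles, f_a=0.25):
    """Accretion test of equation 14: d < f_a * R_N."""
    return d < f_a * halo_size(n_particles)

def halo_overdensity(f_a=0.25):
    """1 + delta_halo ~ 3 / (4 pi f_a^3): N particles in a sphere of radius f_a * R_N."""
    return 3.0 / (4.0 * math.pi * f_a ** 3)

# A 1000-particle halo has R_N = 10, so the accretion radius is 2.5 grid units.
inside = accretes(2.0, 1000)
outside = accretes(3.0, 1000)
one_plus_delta = halo_overdensity()   # about 15, far below ~200
```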

When a collapsing particle touches two (or more) halos in the Lagrangian
space, we use the following criterion to decide whether the two halos
should merge. We compute the Eulerian distance $d$ between the two halos
at the suspected merger time using the halo velocities described above.
The halos are deemed to merge when $d$ is smaller than a fraction of the
Lagrangian radius of the larger halo:

$d < f_m \times \max(R_{N1}, R_{N2})$. (15)

This condition amounts to requiring that the centre of mass of the smaller
halo, say halo 2, is within a distance $f_m R_{N1}$ of the centre of mass
of the larger halo 1. The value of the $f_m$ parameter is fixed in
Appendix A to 0.35.
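
As a worked example of equation 15, with halo sizes chosen here purely for
illustration, consider a halo of 1000 particles touching a halo of 64
particles:

```latex
\begin{align}
R_{N1} &= 1000^{1/3} = 10, \qquad R_{N2} = 64^{1/3} = 4, \\
d &< f_m \times \max(R_{N1}, R_{N2}) = 0.35 \times 10 = 3.5 .
\end{align}
```

The two halos therefore merge only if their centres of mass, displaced to
the suspected merger time, are closer than 3.5 grid spacings.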

We note that PINOCCHIO is not restricted to binary 
mergers. In principle, a particle has 6 Lagrangian neighbours 
so up to 6 halos may merge at the same time. In practice 
binary mergers are the most frequent, but ternary mergers 
also occur, while mergers of four halos or more are rare. 

Figure 5. Mass functions for the ΛCDM model at different redshifts
indicated in the panel. Error bars denote Poissonian errors for the
simulated mass function, continuous lines are the PINOCCHIO predictions,
dotted and dashed lines are the PS and ST predictions, respectively.

In more detail, the fragmentation code works as follows. We keep track of
halo (or filament) assignment for all particles. For each collapsing
particle we consider the halo assignment of all Lagrangian neighbours;
touching halos are those to which a Lagrangian neighbour has been
assigned. The following cases are considered:

(i) If none of the neighbours have collapsed, then the particle is a local
maximum of $F_{\max}$. This particle is a seed for a new halo of unit
mass, created at the particle's position.

(ii) If the particle touches only one halo, then the accre- 
tion condition is checked. If it is satisfied, then the particle 
is added to the halo, otherwise it is marked as belonging to a 
filament. The particles that only touch filaments are marked 
as filaments as well. 



(iii) If the particle touches more than one halo, then the merging
condition is checked for all the touching halo pairs, and the pairs that
satisfy the conditions are merged together. The accretion condition for
the particle is checked for all the touching halos both before and after
merging (when necessary). If the particle can accrete to both halos, but
the halos do not merge, then we assign it to that halo for which $d/R_N$
is the smaller. Occasionally, particles fail to accrete even though the
halos merge.

(iv) When a particle is accreted onto a halo, all filament particles that
neighbour it are accreted as well. This is done in order to mimic the
accretion of filaments onto the halos. Notice that up to 5 filament
particles can flow into a halo at each accretion event.

Figure 6. Eulerian correlation function (upper panels) and Lagrangian
correlation function (lower panels) for the ΛCDM models at three redshifts
indicated in the panel, and for two mass ranges. Symbols refer to
simulation results, lines to PINOCCHIO predictions. Filled squares and
continuous lines: correlation function for low mass halos (mass $M$ from
$6.3 \times 10^{11}$ to $3 \times 10^{12}\,M_\odot$), open squares and
dashed lines: correlation function for massive halos
($M > 3 \times 10^{12}\,M_\odot$).
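
Cases (i)-(iv) can be put together in a compact toy implementation. This
is a sketch under several assumptions, not the PINOCCHIO code: the grid is
treated as non-periodic, the displacement of a particle at growth factor
$b$ is taken to be `b * vel`, the accretion condition is checked only
after merging, and all names (`fragment`, `centred_toy`, the label
constants) are hypothetical.

```python
import itertools
import numpy as np

UNCOLLAPSED, FILAMENT = -2, -1   # labels for points that are not in a halo

def fragment(f_max, vel, f_a=0.25, f_m=0.35):
    """Toy fragmentation of an (n, n, n) grid into halos and filaments.

    f_max : inverse collapse times, positive where collapse occurs.
    vel   : (n, n, n, 3) velocities; displacement at growth factor b is
            b * vel, in grid units (an assumed convention).
    Returns an integer array: halo id (>= 0), FILAMENT or UNCOLLAPSED.
    """
    n = f_max.shape[0]
    label = np.full(f_max.shape, UNCOLLAPSED, dtype=int)
    halos = {}                      # id -> members, summed positions/velocities
    new_id = itertools.count()

    def size(h):                    # R_N = N^(1/3), equation 13
        return len(halos[h]["members"]) ** (1.0 / 3.0)

    def position(h, b):             # centre of mass moved with the mean velocity
        return (halos[h]["sum_q"] + b * halos[h]["sum_v"]) / len(halos[h]["members"])

    def add(h, p):
        halos[h]["members"].append(p)
        halos[h]["sum_q"] += np.array(p, dtype=float)
        halos[h]["sum_v"] += vel[p]
        label[p] = h

    def neighbours(p):              # 6 Lagrangian neighbours, non-periodic grid
        for axis in range(3):
            for step in (-1, 1):
                q = list(p)
                q[axis] += step
                if 0 <= q[axis] < n:
                    yield tuple(q)

    for flat in np.argsort(-f_max, axis=None):       # descending F_max
        p = tuple(int(i) for i in np.unravel_index(flat, f_max.shape))
        if f_max[p] <= 0.0:
            break                                     # the rest never collapses
        b = 1.0 / f_max[p]
        x_p = np.array(p, dtype=float) + b * vel[p]
        near = [int(label[q]) for q in neighbours(p)]
        touching = sorted({h for h in near if h >= 0})

        if all(h == UNCOLLAPSED for h in near):       # case (i): seed a new halo
            h = next(new_id)
            halos[h] = {"members": [], "sum_q": np.zeros(3), "sum_v": np.zeros(3)}
            add(h, p)
            continue

        merged = True                                 # case (iii): equation 15
        while merged and len(touching) > 1:
            merged = False
            for h1, h2 in itertools.combinations(touching, 2):
                d = np.linalg.norm(position(h1, b) - position(h2, b))
                if d < f_m * max(size(h1), size(h2)):
                    for q in halos.pop(h2)["members"]:
                        add(h1, q)
                    touching.remove(h2)
                    merged = True
                    break

        # cases (ii) and (iii): equation 14, keeping the halo with smallest d / R_N
        ratios = [(np.linalg.norm(x_p - position(h, b)) / size(h), h) for h in touching]
        ratios = [(r, h) for r, h in ratios if r < f_a]
        if not ratios:
            label[p] = FILAMENT                       # includes filament-only contact
            continue
        h = min(ratios)[1]
        add(h, p)
        for q in neighbours(p):                       # case (iv): drag filaments along
            if label[q] == FILAMENT:
                add(h, q)
    return label

def centred_toy(n=3):
    """Made-up test field: F_max peaks at the grid centre, flow converges on it."""
    q = np.stack(np.meshgrid(*[np.arange(n)] * 3, indexing="ij"), axis=-1).astype(float)
    offset = q - (n - 1) / 2.0
    f_max = 2.0 - 0.1 * (offset ** 2).sum(axis=-1)
    return f_max, -1.8 * offset

f_max, vel = centred_toy()
labels_infall = fragment(f_max, vel)                  # converging flow
labels_static = fragment(f_max, np.zeros_like(vel))   # no displacements at all
```

With the converging flow every particle ends up in the single halo seeded
at the centre. With zero displacements the central seed never grows,
because its Lagrangian neighbours sit one grid spacing away while the
accretion radius is only $f_a R_N = 0.25$; all other particles are then
classified as filaments.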

This fragmentation code runs extremely quickly, in a 
time almost linearly proportional to the number of particles. 
At late times, slightly more time is spent in updating the 
halo assignment lists in case of mergers, but this does not 
slow down the code much. 

In high density regions where most of the matter has 
collapsed, it can happen that pairs of halos that are able 
to merge are not touched by newly collapsing particles for 
a long time. This problem can be solved by keeping track 
of all the pairs of touching halos that have not merged yet, 
and checking the merging condition explicitly at some time 
intervals. Such a check slows the code down significantly, and 
has only a moderate impact on the results when the fraction 
of collapsed mass at the grid scale is large. Similarly, the accretion of
filament particles onto halos can be checked at some given time intervals,
but again, the impact on the results is modest while the increase in
computer time may be substantial.

While the dynamical estimate of collapse time does 
not introduce any free parameter, the fragmentation pro- 
cess does. The same happens in the simulation, where any 
halo-finding algorithm has at least one free parameter, such 
as the linking length for FOF halos. This is because the 
definition of what constitutes a DM halo is somewhat arbi- 
trary, and hence also the corresponding mass function is not 
unique (Monaco 1999). Fortunately, different clump-finding 
algorithms usually give similar results, so that this ambigu- 
ity is in general not a real problem. In the following, the best 
fit parameters for PINOCCHIO will be chosen so as to repro- 
duce the mass function of the FOF halos of the simulations, 
with linking length equal to 0.2 times the inter-particle dis- 
tance, at many redshifts. We have checked with one SCDM 
output that the differences in the halos as defined by the 
HOP (Eisenstein & Hut 1998) and SO (Lacey & Cole 1994) 
algorithms are much smaller than the accuracy with which 
we are able to recover the FOF halos. 

The five free parameters of the fragmentation code, and the determination
of their best-fit values, are described in Appendix A. Note that the
results shown in paper I use a more limited set of three free parameters,
which is adequate to describe large-volume realisations such as that of
the SCDM simulation, while the larger set of parameters described in
Appendix A has a more general validity.

Figure 7. Count-in-cell analysis of the halo catalogues at z = 0. Left and
right panels show results for the mass ranges indicated. Symbols refer to
simulation results, lines to PINOCCHIO predictions. Continuous, dotted and
dashed lines (or squares, stars and circles) refer to cell sizes of 2, 5
and 10 Mpc (1.3, 3.25 and 6.5 Mpc/h).

The ability of PINOCCHIO to distinguish OC particles that collapse into
halos from those that remain in filaments is shown in figure 4. In this
figure we plot the final position of the particles, as given by the
simulation output at redshift z = 0, for a section of the initial
conditions of the SCDM and ΛCDM simulations. Left panels show only the
filament particles, defined as those which are in OC according to $R_c$
but do not belong to any halo. Right panels show only those particles that
are in halos. Upper panels show the result from the simulation, lower
panels the PINOCCHIO predictions. Clearly, PINOCCHIO is able to
distinguish accurately halos from filaments, even though some filament
particles are interpreted as halo particles and vice versa. When compared
with figures 6 and 7 of Bond et al. (1991), figure 4 shows the marked
improvement of PINOCCHIO with respect to the extended PS approach. We want
to stress that filaments are important in their own right. For example,
most of the Lyman-α absorption lines seen in the spectra of distant
quasars are produced in filaments (e.g. Theuns et al. 1998), so it will be
useful to be able to generate catalogues of halos and filaments.



4 DETAILED COMPARISON TO SIMULATIONS

4.1 Statistical comparison

The comparisons of PINOCCHIO and FOF mass and correlation functions for
the SCDM simulation were presented in paper I, using the more limited set
of three free parameters. The results with the full five-parameter set are
very similar and are not shown here. In figure 5, we compare the mass
function computed using PINOCCHIO and the ΛCDM N-body simulation. The FOF
halos were identified as explained above. For reference, we also plot the
PS and Sheth & Tormen (1999, hereafter ST) mass functions. The choice of
parameters reported in Appendix A produces a PINOCCHIO mass function which
falls to within ~5 per cent of the simulated one from z = 5 to z = 0, for
all mass bins with more than 30-50 particles per halo and for which the
Poisson error bars are small. The only residual systematic is a modest,
~10-20 per cent underestimate at the highest-mass bins and highest
redshift. An accuracy of better than 10 per cent on the mass function for
a given realisation is perfectly adequate for most applications, as it is
usually smaller than the typical sample variance as well as the intrinsic
accuracy of ~20-30 per cent with which the mass function of N-body
simulations is defined. Because PINOCCHIO is calculated for the same
initial conditions as the simulation, Poisson error bars are not the
correct errors to use for this comparison (notice that the Poisson error
bars of the PINOCCHIO mass function are obviously very similar to those of
the numerical one). We show them both for comparison with PS and ST and to
understand which mass bins are affected by small number statistics.

Taking the ST mass function (or the analytic fit of Jenkins et al. 2001)
as a bona fide estimate, we have checked the validity of PINOCCHIO in
reproducing the mass function of halos in a wide variety of cosmologies
and box sizes (see Appendix A). The fit of the mass function is found to
be still good even for halo masses as small as $10^5\,M_\odot$ (ΛCDM
cosmology), at a redshift high enough that the whole box has not yet gone
non-linear.

Strictly speaking the agreement between the mass functions of PINOCCHIO
and the simulated ones is not a proper comparison of prediction with
numerical experiment, as the fit is achieved by tuning the free parameters
discussed in Section 3. However, the very existence of a limited set of
parameters that allows us to achieve such a good agreement in different
cases (SCDM and ΛCDM, PKDGRAV and Hydra, small and large boxes) is a very
important result. As shown also in figure 5, PINOCCHIO improves the fit
with respect to PS, giving an accuracy very similar to the ST fit. Jenkins
et al. (2001) showed that the ST fit underestimates the knee of the FOF
mass function by ~10-20 per cent§; we have verified that when this
difference is evident the best fit PINOCCHIO mass function is more similar
to the numerical one and to the Jenkins et al. (2001) fit than to the ST
mass function. This is evident in figure 1 of paper I (where the residuals
of the z = 0 mass functions are shown), but is hardly noticeable in
figure 5, where Poisson error bars are larger. The comparison with the
ΛCDM simulation shows that the fit is also very good down to the low mass
tail, $M \sim 10^{11}\,M_\odot$ or $M/M_* \sim 10^{-2}$ ($M_*$ denotes the
characteristic mass of the PS mass function, such that
$\sigma^2(M_*) = \delta_c^2$; see, e.g., Monaco 1998).

§ Sheth & Tormen (2001) show that a modest tuning of their parameters can
remove this disagreement.

Figure 8. Comparison of the mass fields for FOF halos identified in the
ΛCDM simulation with that obtained from the $R_c$ field using the PS
mass-radius relation (equation 12; left panel) and using PINOCCHIO (right
panel). For clarity some random noise has been added to the mass field
values, especially to that obtained from the discrete $R_c$ field.

In the PS and excursion set approaches, the mass function is 'universal'
when expressed in terms of the variable $\Omega(<\sigma^2)$ already
defined in Section 2.2 (equation [u^), which in this case gives the
fraction of mass collapsed into objects larger than $M(\sigma^2)$ (with
the mass given by equation 12). The mass functions obtained from a large
set of numerical simulations are indeed found to be universal to within
~30 per cent (Jenkins et al. 2001). The PINOCCHIO mass function is not
universal by construction, yet we find it to be nearly universal once the
resolution effects described in Appendix A are taken into account.

However, the mass function of the Governato et al. (1999) SCDM simulation
used here shows an excess of massive halos at high redshift. This was
already noticed by Governato et al., and quantified as a drift of the
$\delta_c$ parameter from ~1.5 at high redshift to ~1.6 at z = 0. This
trend is not confirmed by other simulations (Jenkins et al. 2001), nor by
our ΛCDM simulation presented here. We find that PINOCCHIO reproduces the
weak trend of Governato et al. (1999) in the SCDM simulation, but also the
lack of such a trend in the ΛCDM one. We conclude therefore that this
effect is likely to be linked to the initial conditions generator, which
is different for the two realisations (see Appendix A for more details).
Recall that the PINOCCHIO mass functions refer to the same initial
conditions as were used to perform the simulations.

In figure 6 we show the correlation function of halos as a function of
mass, both in Eulerian and in Lagrangian space. The correlation function
has been computed using a standard pair counting algorithm. The agreement
between PINOCCHIO and the simulation is very good down to scales of a few
grid cells, i.e. ~1-2 comoving Mpc/h (larger for rarer objects), below
which the PINOCCHIO correlation functions become negative. This is in
agreement with what was found in paper I for the SCDM simulation. The
differences are of order ~10-20 per cent in amplitude and ~10 per cent in
terms of the scale at which a fixed amplitude is reached. This means that
both the correlation length $r_0$, at which $\xi(r_0) = 1$, and the length
at which $\xi = 0$ are reproduced with an accuracy of better than 10 per
cent. This is an improvement with respect to the ST formalism, where the
accuracy is of order ~20 per cent (Colberg et al. 2001). More importantly,
the trends of increased correlation for the more massive halos, or for
halos of a given mass with increasing redshift, are both well reproduced.
The correlation functions in the Lagrangian space are noisier, and are
reproduced with somewhat larger error, especially at z = 0 where they are
slightly overestimated; however this error does not seem to propagate to
the Eulerian correlation functions.
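
A pair-counting estimate of the correlation function can be sketched as
below. The text does not specify which estimator was used, so the choice
made here is an assumption: the estimator $\xi = DD/RR - 1$ in a periodic
cubic box, with the random pair counts $RR$ computed analytically from the
shell volumes. The brute-force distance matrix is only practical for small
catalogues, and the result is meaningful for separations below half the
box size.

```python
import numpy as np

def correlation_function(pos, box, edges):
    """xi(r) = DD / RR - 1 for points in a periodic cubic box of side `box`."""
    pos = np.asarray(pos, dtype=float)
    edges = np.asarray(edges, dtype=float)
    n = len(pos)
    diff = pos[:, None, :] - pos[None, :, :]
    diff -= box * np.round(diff / box)            # minimum-image convention
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    upper = np.triu_indices(n, k=1)               # count each pair once
    dd, _ = np.histogram(dist[upper], bins=edges)
    shell = 4.0 / 3.0 * np.pi * np.diff(edges ** 3)
    rr = 0.5 * n * (n - 1) * shell / box ** 3     # expected pairs if unclustered
    return dd / rr - 1.0

# Unclustered test catalogue: xi should scatter around zero.
rng = np.random.default_rng(0)
random_points = rng.random((500, 3))
xi = correlation_function(random_points, 1.0, [0.05, 0.1, 0.2])
```
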
The two-point correlation function gives only a low-order statistic of the
spatial distribution of a set of objects. To probe the accuracy of the
PINOCCHIO results at higher orders, we have performed a count-in-cell
analysis of the halo distribution, which, unlike the correlation function,
depends also on the phases of the spatial distribution of the halos. This
is shown in figure 7 for galactic-sized
($10^{12}\,M_\odot < M < 10^{13}\,M_\odot$) and group-sized
($M > 10^{13}\,M_\odot$) halos of the ΛCDM realisation, and cell sizes of
2, 5 and 10 Mpc (corresponding to 1.3, 3.25 and 6.5 Mpc/h). The
count-in-cells curves are well reproduced by PINOCCHIO, although their
skewness is slightly underestimated, especially for larger cells and
smaller masses. In particular, the void probability $P_0$ of finding no
halos in the cell is reproduced with an accuracy no worse than a few per
cent when it takes values in excess of 0.6.
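
The count-in-cell distribution and the void probability can be obtained
with a few array operations. The sketch below assumes non-overlapping
cubic cells that tile the box exactly; how the cells were placed in the
analysis above is not stated, so this is only one possible choice, and the
function name is hypothetical.

```python
import numpy as np

def counts_in_cells(pos, box, cell):
    """Probability P_N of finding N points in a cubic cell of side `cell`."""
    pos = np.asarray(pos, dtype=float)
    n_side = int(round(box / cell))
    idx = np.floor(pos / box * n_side).astype(int) % n_side
    flat = np.ravel_multi_index(tuple(idx.T), (n_side,) * 3)
    counts = np.bincount(flat, minlength=n_side ** 3)   # points per cell
    return np.bincount(counts) / counts.size            # p_n[0] is P_0

# Four made-up halo positions in a box of side 2 split into 8 unit cells.
halo_pos = [[0.5, 0.5, 0.5], [0.6, 0.5, 0.5], [1.5, 0.5, 0.5], [1.5, 1.5, 1.5]]
p_n = counts_in_cells(halo_pos, box=2.0, cell=1.0)
void_probability = p_n[0]
```

Here five of the eight cells are empty, so $P_0 = 0.625$.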



4.2 Point-by-point and object-by-object 
comparison 

The PINOCCHIO approach is not just limited to making ac- 
curate predictions for statistical quantities such as the mass 
and correlation functions, but is also able to predict halo 
properties that correspond in detail to those obtained from 
simulations. This is in contrast to the PS approach, where 
the object-by-object agreement is very poor (White 1996, 
but see Sheth et al. 2001 for a different view). 

Agreement at the 'point-by-point level' requires that each particle is
predicted to reside in the correct halo with the correct mass. Whether
this agreement holds can be checked by comparing the mass fields already
defined in section 2.2 (an example of which is shown in figure 1c). We
note that this type of analysis is similar to that of Section 2.2, where
the point-by-point agreement was checked for the $R_c$ fields. In the PS
approach, the mass of the halo to which a particle belongs is estimated as
in equation 12, with the factor $4\pi/3$ valid for top-hat smoothing (or
sometimes left as a free parameter). In this case the mass field is simply
related to the $R_c$ field. A comparison between the mass field obtained
from the same $R_c$ field of figure 1 (with arbitrary normalisation) and
that of the simulation, $M_{\rm FOF}$, reveals only a poor correlation, as
shown in figure 8 (left panel) for a random sample of ~20000 particles
extracted from the ΛCDM simulation. The tightness of the correlation is
again quantified by the $r_S$ and $r_P$ coefficients. This figure is
similar to figure 8 of White (1996) and figure 2 of Sheth et al. (2001),
with the difference that here the $R_c$ curves were computed with ELL
instead of with linear theory (and Gaussian smoothing instead of top-hat).
The point-by-point agreement is much better with PINOCCHIO (right-hand
panel), where the linear correlation coefficient $r_P$ jumps from 0.42 to
0.69, demonstrating the increase in accuracy. Clearly, the improvement of
PINOCCHIO in the point-by-point comparison is not primarily due to the
more accurate dynamical description of collapse. Rather it is due to the
much more accurate description of the shape of the collapsing region,
which is not restricted to the simple PS relation of equation 12.

While the linear correlation coefficient improves significantly going from
$R_c$ to the PINOCCHIO mass field, the Spearman correlation coefficient
$r_S$ does not change much, since both panels contain a large number of
outliers. These are particles that lie at the border of halos, and are
assigned to a halo by the simulation but not by PINOCCHIO, or vice versa.
Such outliers are expected whenever the boundaries of halos in the
Lagrangian space are not perfectly recovered. On the other hand the
presence of such outliers is not very important when the catalogue of
objects is considered.

Figure 9. Comparison on an object-by-object level of halos identified by
PINOCCHIO and found in the ΛCDM simulation, using a variety of statistics.
Continuous, dotted, short-dashed and long-dashed lines refer respectively
to redshifts z = 0, 1, 2 and 4. Top panel: fraction $f_{\rm cl}$ of
cleanly assigned objects; middle panel: fraction $f_{\rm split}$ of
non-cleanly assigned objects; bottom panel: average overlap $f_{\rm ov}$
for cleanly assigned objects. The vertical lines in the top panel indicate
halos with 10, 100, $10^3$, $10^4$ and $10^5$ particles (heavy lines) or
50, 500, $5 \times 10^3$ and $5 \times 10^4$ particles (light lines).

Figure 10. Difference in mass, position and velocity, $\log M$, $x_1$ and
$v_1$ respectively, as estimated by PINOCCHIO and found from the
simulation, for cleanly assigned halos. The scatter around the mean is
plotted below each panel. The lower right panels show for comparison the
displacement of halos according to the simulation. The first set of panels
refers to the SCDM simulation, the second to the ΛCDM one.
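
The two coefficients used above can be computed as follows. The sketch
treats the Spearman coefficient as the linear (Pearson) coefficient of the
ranks, with tied values given their average rank; ties matter here because
all particles of a halo share the same mass. Function names and the sample
data are illustrative only.

```python
import numpy as np

def average_ranks(x):
    """1-based ranks, with tied values replaced by their average rank."""
    x = np.asarray(x).ravel()
    _, inverse, counts = np.unique(x, return_inverse=True, return_counts=True)
    last = np.cumsum(counts)                     # highest rank in each tie group
    return (last - (counts - 1) / 2.0)[inverse]

def pearson(x, y):
    """Linear correlation coefficient r_P."""
    return float(np.corrcoef(x, y)[0, 1])

def spearman(x, y):
    """Rank correlation coefficient r_S = Pearson coefficient of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# A monotonic but non-linear relation: r_S is exactly 1, r_P is not.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = x ** 2
r_p, r_s = pearson(x, y), spearman(x, y)
tied = average_ranks([10, 20, 20, 30])
```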

We next investigate the agreement of PINOCCHIO with the simulations at the
object-by-object level, a coarser level of agreement but more relevant in
practice. The degree of matching between halo catalogues is quantified as
in paper I. For each object of one catalogue, the objects of the other
catalogue that overlap for at least 30 per cent of the Lagrangian volume
are considered. Two halos from different catalogues are 'cleanly assigned'
to each other when each overlaps the other more than any other halo; the
fraction of cleanly assigned halos is $f_{\rm cl}$. The fraction of halos
not cleanly assigned is $f_{\rm split}$. The remainder
$1 - f_{\rm cl} - f_{\rm split}$ is the fraction of objects of one
catalogue that do not overlap with any halo in the other catalogue. These
fractions quantify the level to which two catalogues describe the same set
of halos. Another useful quantity is $f_{\rm ov}$, the average fraction by
which halos overlap when they are cleanly assigned. All these estimators
depend on whether PINOCCHIO is compared with simulations or vice versa,
but in general that difference is small as long as the comparison is good.
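
The matching procedure can be sketched for two catalogues given as halo
labels on the same set of particles. Several details are assumptions made
for this illustration: ungrouped particles carry the label -1, the overlap
of a pair is measured as the number of shared particles relative to the
size of the halo in the reference catalogue, and the function names are
hypothetical. The average overlap $f_{\rm ov}$ is not computed.

```python
from collections import Counter
import numpy as np

def best_partner(labels_a, labels_b, min_overlap=0.3):
    """For each halo of A, the B halo sharing most particles (>= 30 per cent)."""
    a = np.asarray(labels_a).ravel().tolist()
    b = np.asarray(labels_b).ravel().tolist()
    size_a = Counter(h for h in a if h >= 0)
    shared = Counter(zip(a, b))                  # particles common to each pair
    best = {}
    for (ha, hb), n in shared.items():
        if ha < 0 or hb < 0 or n < min_overlap * size_a[ha]:
            continue
        if ha not in best or n > best[ha][1]:
            best[ha] = (hb, n)
    return {ha: hb for ha, (hb, _) in best.items()}, size_a

def match(labels_a, labels_b, min_overlap=0.3):
    """Return (f_cl, f_split) for catalogue A compared with catalogue B."""
    a_to_b, size_a = best_partner(labels_a, labels_b, min_overlap)
    b_to_a, _ = best_partner(labels_b, labels_a, min_overlap)
    clean = sum(1 for ha, hb in a_to_b.items() if b_to_a.get(hb) == ha)
    n_halos = len(size_a)
    return clean / n_halos, (len(a_to_b) - clean) / n_halos

# Made-up catalogues of 14 particles: A has halos 0, 1, 2, 3; B has halos 0, 1.
cat_a = [0, 0, 0, 0, 1, 1, 3, 3, 3, 3, 3, 3, 2, 2]
cat_b = [0, 0, 0, -1, 1, 1, 1, 1, 1, 1, 1, 1, -1, -1]
f_cl, f_split = match(cat_a, cat_b)
```

In this example halos 0 and 3 of A are cleanly assigned, halo 1 of A is
split (its partner in B prefers halo 3), and halo 2 of A has no
counterpart, so $f_{\rm cl} = 0.5$ and $f_{\rm split} = 0.25$.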

In figure 9 we show the values of these three indicators of the agreement
between the two halo catalogues as a function of halo mass, for the ΛCDM
model; the SCDM case was shown in paper I. The agreement is very good at
higher redshift, with ~80-90 per cent of objects cleanly assigned when the
halos have at least 50 particles. The degree of splitting is only ~5 per
cent, while the average overlap of cleanly assigned objects $f_{\rm ov}$
ranges from 60 per cent to 70 per cent, nearly independent of mass and
encouragingly larger than the 30 per cent lower limit. These results are
in agreement with the SCDM ones presented in paper I. The agreement is
slightly worse at lower redshift, with $f_{\rm cl}$ around 70 per cent for
halos with at least 100 particles, and $f_{\rm split} \sim 10$ per cent.
Within perturbative approaches there is obviously no advan- 
tage in going to higher resolution, as the accuracy of LPT 
worsens with the degree of non-linearity (see figure <\1) and 
with it all the results. Anyway, the agreement is still very 
significant for the last output, with a high fraction of cleanly 
assigned objects and a modest degree of splitting. In any case 
the results always improve with increasing number of parti- 
cles. Monaco (1997a) estimated that LPT would break down 
when ~50 per cent of the mass has undergone OC. There- 
fore, the agreement shown in figure |^ (and also in figure ^|) 
is better than expected. 

In figure 10 we show the accuracy with which PINOCCHIO is able to estimate
mass, Eulerian position and velocity of the cleanly assigned objects. In
particular, we show both for SCDM and ΛCDM the scatter plots of the
masses, and of velocity and position along one coordinate axis. For
comparison, the scatter plot of the displacements of FOF halos from the
initial to the final positions is shown as well. Masses are recovered with
an accuracy of ~30 per cent for SCDM and ~40 per cent for ΛCDM, nearly
independent of mass. The average value is slightly biased, which results
from our constraint of reproducing the mass function. Positions are
recovered with a 1D accuracy of ~1 Mpc, slightly depending on the box size
and much smaller than the typical displacements, while velocities are
recovered with a 1D accuracy of ~150 or 100 km/s for SCDM or ΛCDM. In
general, the velocities of the fastest moving halos are underestimated.
This could be fixed by extending the calculation of velocities to third
order LPT, although a straightforward extension has been found not to
work.

The comprehensive analysis of the statistical properties of the PINOCCHIO
halos and their merger histories, presented here and in Taffoni et al.
(2001), demonstrates that the statistical properties of halos are always
well reproduced for halos with more than 30-50 particles, so that the
degrading of quality of the object-by-object comparison with time
(non-linearity) is due to random noise, which does not induce significant
systematics and thus does not hamper the validity of the halo catalogues.

We stress that these comparisons are pure predictions 
of PINOCCHIO, in the sense that the free parameters of 
the method are constrained by the z = 0 mass function
alone. The good agreement with the numerical simulations 
confirms that PINOCCHIO is a successful approximation 
to the gravitational collapse problem in a cosmological and 
hierarchical context. 



4.3 Resolution effects 

As discussed above, PINOCCHIO halos resemble the FOF 
ones closely if they possess a minimum number of particles 
of around 30-100. Statistical quantities are well reproduced 
for halos with at least 30-50 particles. These limits are com- 
fortably similar to the minimum number of particles needed 
by a simulation to produce reliable halos. To show this we
plot in figure 11, for a random set of particles, the mass
fields (i.e. the mass of the halo the particle belongs to) as
determined by the $128^3$ or $256^3$ ΛCDM runs, both for the
simulations and for PINOCCHIO. The result is shown at
z = 0. There is considerable scatter between the masses of
the halos determined from simulations with different
resolutions. This scatter is smaller than that between
PINOCCHIO and simulations, but not by much. The result is
similar at higher redshifts. More details are given in
Appendix B, where it is shown that the match of the ΛCDM and
ΛCDM128 halo catalogues shows a drop in the number of clean
assignments
for halos smaller than ~30 particles (figure Bla), very sim- 
ilar to that shown in figure bl 

This result suggests that resolution affects PINOCCHIO
in a similar way as it affects numerical simulations.
Better resolution leads to increased scatter in the
identification of halos, since the structures become more
non-linear. For instance, we have verified that more massive
halos are reconstructed slightly better by the $128^3$
PINOCCHIO run than by the $256^3$ one. This is because at
higher resolution PINOCCHIO may decide to break up a more
massive halo in two. The degrading of the quality is modest
and amounts to increased random noise, which does not bias
significantly the statistics of the halos.



5 ANGULAR MOMENTUM OF THE DM 
HALOS 

Figure 11. The effects of numerical resolution for simulations and for PINOCCHIO. Halo masses for a random set of particles from the $256^3$ ΛCDM realisation as determined from the simulation and by PINOCCHIO (left and right panels respectively) are compared with the masses from the ΛCDM128 simulation.

Halos are thought to acquire their angular momentum from
tidal torques exerted by the large-scale shear field while they
are still in the mildly non-linear regime (Hoyle 1949;
Peebles 1969; White 1984; Barnes & Efstathiou 1987; Heavens
& Peacock 1988). Under this hypothesis it is possible to
estimate the angular momentum of halos using the Zel'dovich
(1970) approximation or higher-order LPT (Catelan &
Theuns 1996a,b). The biggest difficulty in this calculation is to
identify the Lagrangian patch that is going to become a halo.

However, it was recently shown by Porciani, Hoffmann
& Dekel (2001a,b) that the Zel'dovich approximation is
unable to give very accurate predictions of the spin of halos,
as the highly non-linear interactions of neighbouring halos
tend to randomize their spins. Assuming exact knowledge of
which particles are going to flow into a halo at z = 0 and
using the Zel'dovich approximation to compute the
large-scale shear field, Porciani et al. (2001a) were able to
recover the final angular momentum of the DM halos with an
average alignment angle (defined as the angle between true and
reconstructed spins) of no better than ~40°.

Their analysis highlights the difficulty in predicting a
higher-order quantity such as the spin of DM halos. The
same calculation of spin with N-body simulations is subject
to debate. Comparing our ΛCDM and ΛCDM128 simulations,
we show in Appendix B that for an order-of-magnitude
estimation of angular momentum at least 100 particles per
group are required, while a more robust estimation requires
at least ten times more particles. This is at variance with
other quantities, such as halo mass and velocity, that
converge more rapidly. In the following we will restrict our
analysis to groups larger than 100 particles.

With respect to the analysis of Porciani et al. (2001a),
the PINOCCHIO code presents the advantage of predicting
with good accuracy the instant at which particles get into
the halo, while the actual shape of the halo in the Lagrangian
space is recovered with some noise, especially in the external
borders that in fact contribute most to the angular
momentum. We have verified that the direction of the largest axis
of the inertia tensor of the halos in the Lagrangian space is
recovered within an alignment angle of ~20°, while
ellipticity and prolateness are correctly reproduced, although with
much scatter.

The estimate of the angular momentum of halos is easily
performed within the fragmentation code, with negligible
impact on its speed. When two halos with angular momenta
$\mathbf{L}_1$ and $\mathbf{L}_2$ merge, the spin $\mathbf{L}_{\rm merg}$ of the merger is
estimated as:

$\mathbf{L}_{\rm merg} = \mathbf{L}_1 + \mathbf{L}_2 + \mathbf{L}_{\rm orb}$, (16)

where $\mathbf{L}_{\rm orb}$ is the orbital angular momentum of the two
halos:

$\mathbf{L}_{\rm orb} = M_1 (\Delta\mathbf{q}_1 \times \Delta\mathbf{v}_1) + M_2 (\Delta\mathbf{q}_2 \times \Delta\mathbf{v}_2)$. (17)

Here $\Delta\mathbf{q}_i = \mathbf{q}_i - \mathbf{q}_{\rm cm}$ and $\Delta\mathbf{v}_i = \mathbf{v}_i - \mathbf{v}_{\rm cm}$, with $i = 1,2$,
and $\mathbf{q}_{\rm cm}$ and $\mathbf{v}_{\rm cm}$ are the position and velocity of the
centre of mass. It is worth noticing that the use of Lagrangian
coordinates $\mathbf{q}$ is justified by the parallelism of displacements
and velocities. Following Catelan & Theuns (1997a), we stop
the linear growth of velocities not at the time of merger
$t_{\rm merge}$ but at the time $t_{\rm grow}$ defined as:

$t_{\rm grow} = 0.5\, t_{\rm merge}$, (18)

where $t$ is physical time. This is a suitable generalisation
of the concept of 'detaching' of the perturbation from the
Hubble flow. The case of accretion is treated as a merger
with a 1-particle halo which carries zero spin.

Figure 12. Mass-spin relation for dark-matter halos. Contour lines trace the levels of 0.2, 1, 2 and 4 halos per decade in $\log M$ ($M_\odot$) and $\log L/M$ (km/s Mpc, physical units). Continuous and dotted lines show the contours for FOF and PINOCCHIO halos respectively. Dashed lines give the scaling $L \propto M^{5/3}$ (Catelan & Theuns 1996a). Upper panels: z = 0; lower panels: z = 3. Left panels: no correction; right panels: spin corrected as in equation (19).
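The bookkeeping of equations (16) and (17) can be sketched in a few lines. This is only a minimal illustration, not the PINOCCHIO implementation: the function names are hypothetical, vectors are plain 3-tuples, and the velocities passed in are assumed to have already been evaluated at the time at which their linear growth is stopped.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as sequences."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def orbital_angular_momentum(m1, q1, v1, m2, q2, v2):
    """Orbital term of equation (17), measured about the centre of mass."""
    m_tot = m1 + m2
    q_cm = [(m1 * a + m2 * b) / m_tot for a, b in zip(q1, q2)]
    v_cm = [(m1 * a + m2 * b) / m_tot for a, b in zip(v1, v2)]
    l1 = cross([a - c for a, c in zip(q1, q_cm)],
               [a - c for a, c in zip(v1, v_cm)])
    l2 = cross([a - c for a, c in zip(q2, q_cm)],
               [a - c for a, c in zip(v2, v_cm)])
    return tuple(m1 * x + m2 * y for x, y in zip(l1, l2))


def merged_spin(spin1, spin2, m1, q1, v1, m2, q2, v2):
    """Equation (16): sum of the two spins plus the orbital term."""
    l_orb = orbital_angular_momentum(m1, q1, v1, m2, q2, v2)
    return tuple(a + b + c for a, b, c in zip(spin1, spin2, l_orb))


def accrete_particle(spin, m_halo, q_halo, v_halo, m_part, q_part, v_part):
    """Accretion treated as a merger with a 1-particle halo of zero spin."""
    return merged_spin(spin, (0.0, 0.0, 0.0),
                       m_halo, q_halo, v_halo, m_part, q_part, v_part)
```

Because positions and velocities enter only relative to the centre of mass, the orbital term in this sketch is unchanged by a uniform bulk velocity added to both halos.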

The angular momenta obtained in this way obey a mass-spin
relation which is roughly consistent with that of the FOF
groups. This is shown in the left panels of figure 12 for
the ΛCDM simulation. Although qualitatively similar, the
PINOCCHIO relation overestimates the FOF one by a
factor which is larger for the smaller halos. If the lower value
of the spin of the FOF halos is due to the higher degree of
non-linear shuffling suffered by halos because of tidal
interaction with neighbours, this trend of having lower-mass
halos more randomized than higher-mass ones is in agreement
with that suggested by Porciani et al. (2001a).

It is useful to improve this prediction, so as to obtain
angular momenta for the halos with accurate statistical
properties. To this aim we decrease each component of the spin
at random, following the simple rule:

$L_i^{\rm new} = L_i \times ((1 - f_{\rm spin}) + f_{\rm spin} \times f_{\rm rand})$, (19)

where $f_{\rm spin} = f_0 + f_1 (M/M_*(z))$ (forced to $0 < f_{\rm spin} < 1$)
and $f_{\rm rand}$ is a random number ($0 < f_{\rm rand} < 1$). The two
parameters $f_0$ and $f_1$ are fixed so as to reproduce at best
the mass-spin relation of figure 12. Optimal values are $f_0 = 0.8$
and $f_1 = 0.15$. The right panels of figure 12 show the
resulting mass-spin relations, which agree fairly well with
the FOF ones.
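The random correction of equation (19) can be illustrated with the following minimal sketch. It is not the code used in PINOCCHIO: the function names are hypothetical, and drawing an independent random number for each spin component is an assumption made here for illustration. The default parameter values are the two quoted in the text.

```python
import random


def spin_fraction(mass, m_star, f0=0.8, f1=0.15):
    """Randomised fraction of the spin, clipped to the interval [0, 1]."""
    return min(1.0, max(0.0, f0 + f1 * (mass / m_star)))


def correct_spin(spin, mass, m_star, rng, f0=0.8, f1=0.15):
    """Apply the rule of equation (19) to each spin component.

    Assumption: one independent random number per component.
    """
    fs = spin_fraction(mass, m_star, f0, f1)
    return tuple(c * ((1.0 - fs) + fs * rng.random()) for c in spin)


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so that the sketch is repeatable
    print(correct_spin((1.0, -2.0, 3.0), mass=1.0, m_star=1.0, rng=rng))
```

With this rule each component keeps its sign and can only shrink, by a factor between one minus the randomised fraction and 1.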

Apart from the mass-spin correlation shown in figure 12,
the angular momentum is known to be nearly independent
of other halo properties (Ueda et al. 1994; Cole &
Lacey 1996; Nagashima & Gouda 1998; Lemson & Kauffmann
1999; Bullock et al. 2001; Gardner 2001; Antonuccio-Delogu
et al. 2001), with the exception of a weak dependence on
the merger history of the halos. The dependence of spin on
the environment is still debated (Lemson & Kauffmann 1999;
Antonuccio-Delogu et al. 2001). Gardner (2001) has shown
that halos that have suffered a major merger tend to have
higher spin. In figure 13 we show that this trend is
successfully reproduced by PINOCCHIO halos. Merged halos
at z = 0 have been selected by requiring that the second
largest progenitor halo at z = 0.25 is larger than 0.3 times
the final halo mass. To factor out the mass-spin relation, we
define the quantity $\Lambda = \log L - 1.5 \log(M/M_*)$. As is apparent
in figure 13, the $\Lambda$-distribution of the merged halos is biased
toward larger $\Lambda$-values both for the simulation and for the
PINOCCHIO halos, although the trend may be slightly
underestimated by PINOCCHIO.
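The selection of merged halos and the mass-corrected spin quantity described above amount to two simple operations, sketched here. The function names are hypothetical, and the logarithm is assumed to be base 10; this is an illustration of the definitions in the text rather than the analysis code itself.

```python
import math


def is_major_merger(second_progenitor_mass, final_mass, threshold=0.3):
    """True if the second largest progenitor exceeds threshold * final mass."""
    return second_progenitor_mass > threshold * final_mass


def mass_corrected_spin(spin_modulus, mass, m_star):
    """log L minus 1.5 log(M / M_*), with base-10 logarithms assumed."""
    return math.log10(spin_modulus) - 1.5 * math.log10(mass / m_star)
```

Comparing the distribution of the second quantity for halos that pass the first test with that of all halos is what the histograms of the spin-merger figure show.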

The agreement at the object-by-object level is in line
with the intrinsic limits of perturbative theories found by
Porciani et al. (2001a). Figure 14 shows the alignment angle
$\theta$ for the spins of cleanly matched FOF and PINOCCHIO
halos, and their average values computed in bins of mass
(error bars indicate the rms around the mean). While the
left panel shows all halos, the right panel is restricted to
those pairs of halos that overlap by more than 70 per cent.
The average angle is significantly smaller than 90°,
highlighting a significant correlation of PINOCCHIO and FOF
spins. However, for the full sample the average alignment
angle is at best ~60°. This is mostly due to errors in the
definition of the halo, as shown by the right panel, where the
best reconstructed halos with more than 1000 particles show
an average alignment angle of ~30-40°, consistent with the
intrinsic limit quoted by Porciani et al. (2001a).
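The reference value of 90° used above is the mean angle expected for two uncorrelated, isotropically oriented spins. The following sketch computes the alignment angle between two spin vectors and checks that reference value by Monte Carlo; the function names and the sampling scheme (Gaussian triples, which are isotropic in direction) are illustrative choices, not part of PINOCCHIO.

```python
import math
import random


def alignment_angle_deg(spin_a, spin_b):
    """Angle in degrees between two non-zero 3-vectors."""
    dot = sum(x * y for x, y in zip(spin_a, spin_b))
    norm = (math.sqrt(sum(x * x for x in spin_a))
            * math.sqrt(sum(y * y for y in spin_b)))
    cos_theta = max(-1.0, min(1.0, dot / norm))  # guard against round-off
    return math.degrees(math.acos(cos_theta))


def random_spin(rng):
    """Vector with an isotropically distributed direction."""
    return [rng.gauss(0.0, 1.0) for _ in range(3)]


def mean_angle_uncorrelated(n_pairs, seed=0):
    """Monte Carlo mean alignment angle for uncorrelated spin pairs."""
    rng = random.Random(seed)
    total = sum(alignment_angle_deg(random_spin(rng), random_spin(rng))
                for _ in range(n_pairs))
    return total / n_pairs
```

For a large number of pairs the mean returned by `mean_angle_uncorrelated` approaches 90°, so a measured average well below that value signals a real correlation between the two sets of spins.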

To conclude, the prediction of angular momentum of
halos is severely hampered by the intrinsic limits of
linear theory described by Porciani et al. (2001a) and
further worsened by the error made by PINOCCHIO in
assigning particles to halos. The correct statistics are
reproduced only by introducing two more 'fudge' free
parameters, while the object-by-object agreement is poor although
significant. However, even N-body simulations do not
converge rapidly in estimating this quantity (see Appendix B).
Moreover, the important spin-merger correlation is
recovered naturally. Although we do not claim this result as a big
success, we notice that PINOCCHIO is, to our knowledge,
the only perturbative algorithm able to predict the spin of
halos at the object-by-object level. In addition, the prediction
of spins comes at almost no additional computational cost,
and the whole acquisition history of angular momentum can
be followed for each halo. Thus, we regard the use of the
angular momenta provided by PINOCCHIO as a viable
alternative to drawing them at random from some distribution
that fits N-body simulations (Cole et al. 2000; Vitvitska et
al. 2001; Maller, Dekel & Somerville 2001).



6 DISCUSSION 

PINOCCHIO is an approximation to the full non-linear
gravitational problem of hierarchical structure formation in
a cosmological setting, in contrast to the mostly statistical
approaches such as the PS prescription. The good agreement
in detail between PINOCCHIO and FOF halos identified in
simulations explains the ability of the method to generate
reliable halo catalogues. It also demonstrates that the
underlying dynamical approximations work well. With respect
to the results of Monaco (1995; 1997a,b), PINOCCHIO
addresses successfully the geometrical problem of the
fragmentation of the collapsed medium into objects and filaments.

Figure 13. Correlation between spin and merging history. Continuous lines: all halos; dashed lines: halos that have suffered a major merger. Left panels: PINOCCHIO predictions; right panels: simulation. Upper panels: halos with $M > 10^{12} M_\odot$; lower panels: halos with $M > 10^{13} M_\odot$.

Figure 14. Alignment angle $\theta$ for pairs of cleanly assigned halos at z = 0 as a function of mass. The points with error bars denote averages in mass bins; the error bars give the rms around the mean. (a) all halos, (b) halos with $f_{\rm ov} > 0.7$.

While a direct analytical rendering of the fragmenta- 
tion prescription as used in PINOCCHIO seems very com- 
plex, because it requires knowledge of spatial correlations to 
high order, analytical progress might nevertheless be pos- 
sible. For instance, Monaco & Murante (1998) proposed to 
generalise the mass-radius relation of PS, to allow a more
general distribution of masses to form at a given smoothing
radius. This was formulated in terms of a 'growing' curve
for the objects, which gives the fraction of mass acquired by
the object at a given smoothing radius. The mass function
is then obtained by a deconvolution of the $\Omega(<\sigma^2)$ function
(as obtained from ELL collapse, like in figure |^) with the 
growing curve of the objects. This growing curve could be
estimated from the results of PINOCCHIO, giving an
improved analytical expression for the mass function. However,
in the case of Gaussian smoothing, merging histories cannot
be computed from the excursion set formalism, because the
trajectories are strongly correlated (Peacock & Heavens 1990;
Bond et al. 1991), so that the random walk formalism cannot
be used. Moreover, such an approach cannot provide
full information on the spatial distribution of objects.
Such analytic extensions of PINOCCHIO would therefore not
be as powerful as the full analysis. Besides, analytic formalisms
based on peaks (Manrique & Salvador-Solé 1995; Hanami
1999) are manageable only when linear theory is used. We
therefore regard methods like PINOCCHIO, which are based
on an actual realisation of the linear density field, as a good
compromise between performing a simulation and obtaining
only statistical information from a PS-like approximation.

As mentioned in the introduction, similar methods have 
been proposed in the literature, such as the peak-patch 
method of Bond & Myers (1996a), the block model of Cole 
& Kaiser (1988), and the merging cell model of Rodrigues 
& Thomas (1996) and Lanzoni et al. (2000). A qualitative 
comparison with peak-patch reveals a similar accuracy in re- 
producing the masses of the objects. From figure 10 of Bond 
& Myers (1996b) it is apparent that, in a context analogous 
to our SCDM simulation, masses are recovered with an ac- 
curacy of ~0.2 dex, not much worse than the one given in 
our figure [H] for SCDM. Unfortunately, it is not clear from 
the Bond & Myers papers to what extent the agreement can
be pushed down to galactic masses. As linear theory
underpredicts the fraction of collapsed mass when the variance is
large, a deficit of peaks corresponding to smaller masses is
possible. The objects selected by peak-patch are constrained
to be spherical in the Lagrangian space (they collapse like
ellipsoids but start off as spheres perturbed by the tidal field),
while PINOCCHIO is not restricted in this sense and is able
to reproduce the orientation of the objects in the Lagrangian
space, as mentioned in Section 5. Moreover, PINOCCHIO
is not affected by the problem of peaks overlapping in the
Lagrangian space. Finally, peak-patch has never been
extended, to the best of our knowledge, to predict the merger
histories of objects.

The merging cell model of Lanzoni et al. (2000) shares
some properties with PINOCCHIO, in particular the fact
that both codes build up halos through mergers and
accretion. However, the non-linear ellipsoidal collapse of
PINOCCHIO is an important improvement, as is the use of
Gaussian filters instead of boxcar smoothing. Also, the size of
the merging objects tends to be quite large in the merging
cell model, whereas PINOCCHIO allows accretion of single
particles. We have been able to compare our results
directly with those of Lanzoni et al. (2000). The halos
identified in the merging cell model do not accurately reproduce
those from the simulations. This poorer level of agreement
is partly due to the cubic shape of the cells and to the coarse
resolution of the boxcar smoothing. As a consequence of
these choices, massive halos appear as big square boxes, and
the mass function shows fluctuations with spacing of factors
of two that reflect the smoothing.

6.1 First-axis versus third-axis collapse 

Recently there has been extensive discussion in the litera- 
ture about whether the collapse of the first axis is enough to 
characterise gravitational collapse, or whether all three axes 
should reach vanishing size (Bond & Myers 1996a; Audit et 
al. 1997; Lee & Shandarin 1998; Sheth et al. 2001). Here we
try to clarify this issue, showing that apparently contradic- 
tory claims result from different interpretations of ellipsoidal 
collapse, and from the choice of smoothing window. 

As described in Section 2.1, ellipsoidal collapse can be
considered as a truncation of LPT, a convenient description
of the dynamical evolution of a mass element. In other
words, ELL does not attempt to describe the collapse of
an extended ellipsoidal peak; rather, it operates at the
infinitesimal level. Given this, OC appears as the most sensible
choice for the collapse condition, for the reasons already
outlined in Section 2.1, and with the caveat that the mass 
undergoing OC may end up either in halos or in filaments. 
OC corresponds to collapse along the first axis, which means 
that the ellipsoid has undergone pancake collapse. However, 
this does not imply that the extended region is flattened as 
well. Indeed, as the example in Monaco (1998) illustrates, 
in the collapse of a spherical peak with decreasing density 
profile, all mass elements (except for the one in the centre) 
collapse as needles pointing to the centre. This is because 
the spherical symmetry guarantees that the first and second 
axis collapse together. Yet the collapse of the peak is not 
that of a filament but of a sphere. This shows how mislead- 
ing the local geometry of collapse is for understanding the 
global geometrical properties of the collapsing matter. 

Alternatively, ellipsoidal collapse can be used to model 
extended regions associated to a particular set of points, 
such as density peaks (Bond & Myers 1996; Sheth et al. 
2001). In this case, first-axis collapse truly corresponds to 
the formation of a flattened structure, while third-axis col- 
lapse corresponds to the formation of a spheroidal object. 
For instance, in the case of the spherical peak mentioned 
above, the peak point is collapsing in a spherical way both 
locally and globally. It is clear that in such cases a satis- 
factory definition of collapse must be related to third-axis 
collapse. Sheth et al. (2001) showed that indeed this collapse 
condition improves the agreement with simulations when the 
centres of mass of FOF objects are considered (a particular 
set of points analogous to the peaks), but does not help 
much when general unconstrained points are considered. 

The two definitions of collapse are very different from
many points of view. First-axis collapse is on average
faster than the spherical one (Bertschinger & Jain 1994),
while third-axis collapse is correspondingly slower.
Moreover, while 50 per cent of mass is predicted to collapse at
very late times by linear theory (starting from a density
field with finite variance and not taking into account the
cloud-in-cloud problem), 23/25 ~ 92 per cent of mass is
predicted to undergo first-axis collapse, but only 8 per cent
third-axis collapse. This is very important when computing
the mass function with a PS-like approach: while first-axis
collapse more or less reproduces the correct normalisation
(Monaco 1997b), third-axis collapse requires a large 'fudge
factor' ~12 (Lee & Shandarin 1998), as only 8 per cent of
mass is available for collapse.
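The numbers quoted in this paragraph are tied together by simple arithmetic. As a worked check, and assuming that the 'fudge factor' is just the inverse of the mass fraction available for collapse, one finds (the subscripted symbols are introduced here only for illustration):

```latex
\begin{align}
  f_{\rm 1st} &= \frac{23}{25} = 0.92, &
  \frac{1}{f_{\rm 1st}} &\simeq 1.09, \\
  f_{\rm 3rd} &= 1 - \frac{23}{25} = \frac{2}{25} = 0.08, &
  \frac{1}{f_{\rm 3rd}} &= 12.5, \\
  f_{\rm lin} &= \frac{1}{2}, &
  \frac{1}{f_{\rm lin}} &= 2 .
\end{align}
```

The last line is the familiar factor of 2 of the PS approach, the second reproduces the factor of ~12 required by third-axis collapse, and the first shows why first-axis collapse needs almost no renormalisation.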

Within the framework of the excursion set approach,
it is interesting to understand whether the introduction of
ellipsoidal collapse is going to improve the statistical
agreement between simulations and PS. Monaco (1997b) and
Sheth et al. (2001) showed that ellipsoidal collapse can be
introduced through a 'moving' barrier which depends on the
variance $\sigma^2$ of the smoothed field. Third-axis collapse gives
longer collapse times than spherical collapse, and this
corresponds to a barrier which rises with $\sigma^2$, while the opposite
is true for first-axis collapse. In the case of sharp k-space
smoothing, the fixed barrier reproduces the PS mass function
and hence overestimates the number of low mass objects.
Sheth et al. (2001) showed that using the moving barrier
appropriate for third-axis collapse leads to the formation
of fewer low mass objects, and hence improves the mass
function. However, when Gaussian smoothing is used, the
fixed-barrier solution is different from PS, and the number
of small mass halos is now severely underestimated. Monaco
(1997b, 1998b) showed that in this case first-axis collapse
(with no free parameter to tune) produces a reasonable fit
to the simulations, with some improvement with respect to
PS.

From these considerations, it is clear that a successful
definition of collapse depends on many technical details,
such as the kind of dynamics considered (mass elements
versus extended regions) and the type of smoothing used (sharp
k-space versus Gaussian smoothing). We choose to consider
Gaussian smoothing and first-axis collapse (OC) applied to
mass elements. These choices are consistent with the
excursion set approach, but need to be supplemented by an
algorithm to fragment the collapsed medium into halos and
filaments. This is because the collapse definition operates
on mass elements and does not specify the larger structures
that collapse together. Moreover, the strong correlation of
Gaussian trajectories in the F-R plane implies that merger
histories cannot be recovered with the same simple and
elegant algorithm used by Bond et al. (1991) and Lacey & Cole
(1993). The alternative algorithm proposed by Sheth et al.
(2001), based on sharp k-space smoothing, has the
advantage of being analytical and simpler than PINOCCHIO.
However, while the choice of third-axis collapse is physically
motivated by comparing the collapse of the centres of mass of
FOF halos to the simulations, the probability distribution
used in the excursion set approach is the unconstrained
one of the general points. Moreover, the mass of the objects is
still estimated as if top-hat smoothing were used. We therefore
regard Sheth et al. (2001) as another phenomenological model,
albeit improved with respect to PS and very effective in
providing statistical information.



7 CONCLUSIONS

We have presented a detailed description of PINOCCHIO, a
fast and perturbative approach for generating catalogues of
DM halos in hierarchical cosmologies. Given a set of
initial conditions, PINOCCHIO produces masses, positions,
velocities and angular momenta for a catalogue of halos.
Because PINOCCHIO is based on reconstructing the
mergers of halos, accurate information on the progenitors of
halos is available automatically (Taffoni et al. 2001). We have
compared in detail these catalogues with two N-body
simulations which use different cosmologies, resolutions, mass
ranges and N-body codes. The match is very good, both
for statistical quantities, which are recovered with a ~5 per
cent accuracy for the mass function and ~20 per cent for the
correlation function (~10 per cent error in $r_0$), and at the
object-by-object level, where a ~20-40 per cent accuracy
in halo mass is reached for 70-100 per cent of the objects
that have at least 30-100 particles. These results show that
PINOCCHIO is a proper approximation of the gravitational
problem, and not simply a phenomenological model able to
reproduce some particular aspects of gravitational collapse.

PINOCCHIO consists of two steps. In the first step, the
estimate of collapse time, no free parameter is introduced,
as collapse is defined as OC. The second step addresses the
geometrical problem of the fragmentation of the collapsed
medium into objects and the disentanglement of the filament
web. This is analogous to the process of clump finding in
N-body simulations, and requires the introduction of free
parameters, at least one to specify the level of overdensity
at which halos are selected (indeed, while algorithms like
FOF or SO introduce only one parameter, others like HOP
introduce more than one). In fact five free parameters (that
are not independent) are introduced, two to characterize the
events of merging and accretion, and the others for fixing
resolution effects. They are tuned by reproducing the FOF
numerical mass function with linking length equal to 0.2.

PINOCCHIO is fast and can be run even with small
computers: all the tests presented in this paper were run
with a simple PC with a Pentium III 450 MHz processor and
512 MB of RAM. For a grid of $256^3$ particles the first step runs
in ~6 hours independent of the degree of non-linearity, while
the second step requires only a few minutes. The typical
outcome of such a run is a catalogue of many thousands of
objects with known positions, merger histories and angular
momenta. With a supercomputer one could run tens of large,
say $512^3$, realisations in a fraction of the time required by
a single one to be run with a standard N-body code, and
obtain all the merger histories without the expensive
post-processing analysis required in the case of simulations.

The results of PINOCCHIO are suitable for studies
of astrophysical events in a cosmological context, as they
give most of the information that a large-volume
N-body simulation can give. In particular, the availability
of catalogues with final positions, merger histories and
angular momentum makes PINOCCHIO a suitable tool
to be used in the context of galaxy formation. A public
version of PINOCCHIO is available at the web site
http://www.daut.univ.trieste.it/pinocchio



ACKNOWLEDGMENTS 

The authors thank Stefano Borgani, Fabio Governato,
Barbara Lanzoni and Cristiano Porciani for many discussions.
Fabio Governato, Tom Quinn and Joachim Stadel have
kindly provided the $360^3$ SCDM simulations used in this
paper. PM acknowledges support from MURST. TT
acknowledges support by the 'Formation and Evolution of
Galaxies' network set up by the European Commission
under contract ERB FMRX-CT96086 of its TMR programme,
and thanks PPARC for the award of a post-doctoral
fellowship. Research conducted in cooperation with Silicon
Graphics/Cray Research utilising the Origin 2000 supercomputer
at DAMTP, Cambridge.



REFERENCES 

Antonuccio-Delogu V., Becciani U. 
Colafrancesco S.. Germans, A., 



Pagliaro A., van Kampen E., 
Gambera M., 2001, MNRAS, 
in press (astro-ph/0009495) 
Audit E., Teyssier R., Alimi J. M., 1997, A&A, 325, 439 
Bardeen J.M., Bond J.R., Kaiser N., Szalay A.S., 1986, ApJ, 304, 
15 

Barnes J., Efstathiou G. P., 1987, ApJ, 319, 575 

Benson A.J., Cole S., Frenk C.S., Baugh C.M., Lacey C.G,. 2000, 

MNRAS, 311, 793 
Bertschinger E., Jain B., 1994, ApJ, 431, 486 
Bode P., Ba hcall N.A.. Ford E .B.. Ostriker J. P., 2000, submitted 



to ApJ ( [a.stro-ph/0011376 ) 
Bond J.R., Cole S., Efstathiou G., Kaiser N., 1991, ApJ, 379, 440 
Bond J.R., Myers S.T., 1996a, ApJS, 103, 1 
Bond J.R., Myers S.T., 1996b, ApJS, 103, 41 
Borgani S., Coles P., Moscardini L., 1994, MNRAS, 271, 223 
Bouchet F., 1996, in Dark Matter in the Universe, ed. S. 

Bonometto et al. IOS, Amsterdam 
Buchert T., 1992, A&A, MNRAS, 254, 729 

Buchert T., 1996, in Dark Matter in the Universe, ed. S. 

Bonometto et al. IOS, Amsterdam 
Buchert T., Ehlers J., 1993, MNRAS 264, 375 
Bullock J.S., Dekel A., Kolatt T.S., Kravtsov A.V., Klypin A. A., 

Porciani A., Primack J.R., 2001, ApJ, 555, 240 
Catelan P., 1995, MNRAS, 276, 115 

Catelan P., Lucchin F., Matarrese S., Porciani C., 1998, MNRAS, 
297, 692 

Catelan P., Theuns T., 1996a, MNRAS, 282, 436 
Catelan P., Theuns T., 1996b, MNRAS, 282, 455 
Cavaliere A., Colafrancesco S., Menci N., 1992, ApJ, 392, 41 
Cavaliere A., Menci N., Tozzi P., 1996, ApJ, 464, 44 
Cole S., Kaiser N., 1988, MNRAS, 233, 637 
Cole S., Lacey C.G., 1996, MNRAS, 281, 716 
Cole S., Lacey C.G., Baugh CM., Frenk C.S., 2000, MNRAS, 
319, 168 

Coles P. Melott A.L., Shandarin S.F., 1993, MNRAS, 260, 765 
Couchman H., Thomas P., Pearce F., 1995, ApJ 452 797 
iaferio A., Kauffmann G., Colberg J.M., White S.D.M., 1999, MN- 
RAS, 307, 537 

Efstathiou, G., Frenk, C.S., White, S.D.M., Davis, M., 1988, MN- 
RAS, 235, 715 
Eisenstein D.J., Hut P., 1998, ApJ, 498, 137 
Gardner J. P., 2001, ApJ, 557, 616 



© 0000 RAS, MNRAS 000, 000-000 



PINOCCHIO 23 



Governato F., Babul A., Quinn T., Tozzi P., Baugh CM., Katz 

N., Lake G., 1999, MNRAS, 307, 949 
Hanami H., 1999, submitted to MNRAS ( astro-ph/9910033) 
Heavens A., Peacock J., 1988, MNRAS, 232, 339 
Hoyle F., 1949, in Burgers J.M., van der Hulst H.C., eds., Prob- 
lems of Cosmical Aerodynamics, Central Air Document Of- 
fice, Dayton, p. 195 
Jenkins A., Frenk C.S., White S.D.M., Colberg J.M., Cole S., 
Evrard A.E., Couchman H.M.P., Yoshida N., 2001, MNRAS, 
321, 372 

Ker scher M., Buchert T.. Futamase T., 2000, submitted to ApJ 



(astro-ph/0007284) 



2000, MNRAS, 312, 



6 



319 



Monaco P. 
Monaco P. 
Monaco P. 



Lacey C, Cole S., 1993, MNRAS, 262, 627 
Lacey C, Cole S., 1994, MNRAS, 271, 676 
Lanzoni B., Mamon G.A., Guiderdoni B. 
781 

Lemson G., Kauffmann G., 1998, MNRAS, 302, 111 
Lee J., Shandarin S.F., 1998, ApJ, 500, 14 

Mail er A.H.. Dekel A Somerville R.S., 2001, MNRAS, in press 

( astro-ph/0105168j ) 
Manrique A., Salvador-Sole E., 1995, ApJ, 453, 
Mo H., White S.D.M., 1996, MNRAS, 282, 189 
Mo H., Mao S., White S. D. M., 1998, MNRAS, 295, 
Monaco P., 1995, ApJ, 447, 23 
Monaco P., 1997a, MNRAS, 287, 753 
1997b, MNRAS, 287, 753 

1998, Fund. Cosm. Phys., 19, 153 

1999, in Observational Cosmology: the development 
of galaxy systems, ed. G. Giuricin, M. Mezetti & P. Salucci. 
ASP Conf. Ser. 176. Pag. 186 

Monaco P., Murante C, 1999, Phys. Rev. D, 60, 0635XX 
Monaco P., Theuns T., Taffoni G., Governato F., Quinn T., Stadel 

J., 2001, ApJ, in press (paper I) 
Moutarde F.,Alimi J. M. ,Bouchet F. R.,Pellat R., Ramani A., 

1991, ApJ 382, 377 
Nagashima M., Gouda N., 1998, MNRAS, 301, 849 
Padmanabhan T., 1993, Structure Formation in the Universe. 

Cambridge University Press 
Peacock J. A. Heavens A. F., 1985, MNRAS, 217, 805 
Peacock J. A. Heavens A. F., 1990, MNRAS, 243, 133 
Peebles P.J.E., 1993, Principles of Physical Cosmology, Princeton 

Univeristy Press, Princeton 
Porciani C, Catelan P., Lacey C, 1999, ApJ, 513, L99 

Hoffmann Y., 2001a, MNRAS, in press 



Hoffmann Y., 2001b, MNRAS, in press 



Por ciani C. Dekel A. , 

( astro- ph/0105123; ) 
Por ciani C. Dekel A. , 

( astro- ph/0105165| ) 
Press W.H., Schechter P., 1974, ApJ, 187, 425 

Rodrigues D.D.C., Thomas P.A., 1996, MNRAS, 282, 631 

Scoccim arro R. Sheth R.K., 2001, submitted to MNRAS ( astro- 

ph/( flQ6120[ ) 

Shandarin S.F., Zel'dovich Ya.B.,1989, Rev. Mod. Phys., 61, 185 
Sheth R.K., Lemson G., 1999, MNRAS, 305, 946 
Sheth R.K., Mo H., Tormen G., 2001, MNRAS, in press 
Sheth R.K., Tormen G., 1999, MNRAS, 308, 119 
Sheth R.K., Tormen G., 2001, MNRAS, in press (astro-ph/0105113)
Somerville R.S., Kolatt T.S., 1999, MNRAS, 305, 1
Somerville R.S., Primack J.R., 1999, MNRAS, 310, 1087
Taffoni G., Monaco P., Theuns T., 2001, MNRAS, submitted
Theuns T., Leonard A., Efstathiou G., Pearce F.R., Thomas P.A., 1998, MNRAS, 301, 478
Theuns T., Taffoni G., Monaco P., 2001, submitted to MNRAS
Ueda H., Shimasaku K., Suginohara T., Suto Y., 1994, PASJ, 46, 319
Vitvitska M., Klypin A.A., Kravtsov A.V., Bullock J.S., Wechsler R.H., Primack J.R., astro-ph/0105349
White S.D.M., 1984, ApJ, 286, 38
White S.D.M., 1996, in Schaeffer R. et al., eds, Cosmology & Large-scale structure, Proc. 60th Les Houches School, Elsevier, p. 349

White S.D.M., Frenk C.S., 1991, ApJ, 379, 52 
White S.D.M., Silk J., 1979, ApJ, 232, 1 

Zel'dovich Ya.B., 1970, Astrofizika, 6, 319 (translated in Astrophysics, 6, 164 [1973])



APPENDIX A: PARAMETERS FOR MERGING 
AND ACCRETION 

The accretion and merging conditions given in equations (14)
and (15) work well when the halos contain sufficiently many
particles. However, for smaller halos, the limiting distance
$f_a R_N$ or $f_m R_N$ may be comparable to the grid spacing. In
this case, the Zel'dovich velocity $v_{\max}$ needs to be very
accurate for accretion or merging to take place, and
this may lead PINOCCHIO to underestimate the number of
very low mass objects. The simplest solution to this problem
is to add a constant $f_r$, of order of the grid spacing, to the
right hand side of equations (14) and (15), as was done in paper I.
This brings the number of parameters in PINOCCHIO to
three: $f_a = 0.18$, $f_m = 0.35$ and $f_r = 0.70$.

However, when applied to the ΛCDM simulation, this
choice produces a systematic excess of low mass objects at
high redshift, of order 20 per cent at $z = 4$ for objects of
30 particles. This excess is barely noticeable in the SCDM
simulation at $z = 1.13$ (see figure 1 of paper I). The origin
of this systematic effect is the following. The accuracy
of LPT in estimating the velocities is not constant in time,
but depends on the degree of non-linearity reached, worsening
at later times. It can be measured by comparing the
Zel'dovich displacements with those from the simulation, for
particles that are just experiencing OC collapse according
to the $F_{\max}$ field.

In figure A1 we show that the error in the displacement
increases as the field becomes more non-linear. The rate of
increase is very similar for the two cosmological models plotted.
The errors in the displacements are much smaller than
the displacements themselves, demonstrating the power of
the Zel'dovich approximation. While the average displacement
grows as $b(t)$, its error grows as $b(t)^{1.7}$.
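The two scalings quoted above imply that the relative error on the displacement grows as $b^{0.7}$. A minimal sketch of that arithmetic follows; the normalisations `disp0` and `err0` are hypothetical placeholders chosen for illustration, not values measured from the simulations.

```python
def relative_displacement_error(b, disp0=1.0, err0=0.1):
    """Relative error of the Zel'dovich displacement at growing mode b.

    Assumes displacement = disp0 * b and error = err0 * b**1.7.
    disp0 and err0 are illustrative normalisations, not fitted values.
    """
    displacement = disp0 * b
    error = err0 * b ** 1.7
    return error / displacement


# Doubling b raises the relative error by a factor 2**0.7.
ratio = relative_displacement_error(2.0) / relative_displacement_error(1.0)
```

The ratio is independent of the assumed normalisations, which is why only the exponents matter for the argument that follows.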

The fact that displacements are computed more accurately
at earlier times has two important consequences.
Firstly, the accuracy of the reconstruction of particle positions
will degrade with time, as we illustrated in Section
4. Secondly, objects at higher redshifts will tend to accrete
mass more easily than at later times, for a given set of
parameters $f_a$, $f_m$ and $f_r$. The reason is that, if a particle
should accrete onto a halo at late times, we need to make
these parameters sufficiently generous that the particle
falls within $d$ of the halo according to Eq. (14), even though we
are unable to compute the position of the particle very
accurately. As a result, this may lead to too much accretion
at earlier times, when the positions are more accurate.

It is possible to improve PINOCCHIO to correct for
this numerical problem. What is relevant in the fragmentation
code is not the absolute displacement of a particle, but
the displacement relative to that of the halo. The distance
between a collapsing particle and the centre of mass of a
group is $d \sim S_{a,b} \times R_N$. Considering that $S_{a,b} \propto b$,
that its variance scales as the variance $\sigma^2$ of the linear
density, and that the relative error on $S_{a,b}$ grows $\propto b(t)^{0.7}$,
we can estimate the uncertainty on $d$, given the errors in
reconstructing positions, as

$\delta d = f_s \, \sigma(R_N) \, R_N \, b(t) \, [\sigma(R_N) \, b(t)]^{0.7}$. (A1)

Here, $f_s$ is another free parameter. We only introduce this
extra parameter in the accretion condition, since the results
do not improve when we apply a similar correction to the
merger condition. The accretion and merging conditions are
then:

$d < f_a \times R_N + f_{ra} + \delta d$ (A2)

$d < f_m \times \max(R_{N1}, R_{N2}) + f_{rm}$. (A3)

We note that the resolution parameter $f_r$ is now different
for accretion and for merging.
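The conditions (A1) to (A3) can be sketched as three small functions. This is an illustration under stated assumptions, not the PINOCCHIO source: the function names are hypothetical, all lengths are assumed to be in units of the grid spacing, and `sigma_RN` stands for $\sigma(R_N)$.

```python
def delta_d(f_s, sigma_RN, R_N, b):
    """Positional uncertainty of equation (A1), in grid units (assumed)."""
    return f_s * sigma_RN * R_N * b * (sigma_RN * b) ** 0.7


def can_accrete(d, R_N, sigma_RN, b, f_a, f_ra, f_s):
    """Accretion condition (A2): d < f_a * R_N + f_ra + delta_d."""
    return d < f_a * R_N + f_ra + delta_d(f_s, sigma_RN, R_N, b)


def can_merge(d, R_N1, R_N2, f_m, f_rm):
    """Merging condition (A3): d < f_m * max(R_N1, R_N2) + f_rm."""
    return d < f_m * max(R_N1, R_N2) + f_rm


# Illustrative call: a particle at d = 1.0 from a halo with R_N = 2.0,
# with made-up sigma(R_N) = 1.5 and b = 1.0.
with_correction = can_accrete(1.0, 2.0, 1.5, 1.0, f_a=0.22, f_ra=0.40, f_s=0.06)
without_correction = can_accrete(1.0, 2.0, 1.5, 1.0, f_a=0.22, f_ra=0.40, f_s=0.0)
```

With these made-up inputs the particle accretes only when the $\delta d$ term is switched on, which is the effect the extra parameter $f_s$ is meant to capture.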

Our algorithm now contains five parameters. The best
fit values have been determined by generating many realisations
of Gaussian fields (including the initial conditions of
the SCDM, ΛCDM and ΛCDM128 simulations used here)
for different cosmological models, box sizes and resolutions,
and determining for each realisation those parameters that
best fit the corresponding mass function. We used the analytical
mass function of Jenkins et al. (2001) as template. The
best fit is easily achieved, as the effects of small variations
of only one parameter are rather simple. In particular, $f_m$
determines the overall slope of the mass function, $f_{rm}$ the
slope at low masses, $f_a$ the normalisation, $f_{ra}$ the abundance
of low mass halos and $f_s$ the abundance of low mass halos
at low redshifts.

The best fit values are $f_m = 0.35$ and $f_{rm} = 0.7$, as
in paper I. The parameters for accretion are found to be
correlated,

$f_{ra} = 0.40 - 3.5 \, (f_a - 0.22)$. (A4)

In addition, $f_{ra}$ correlates with the degree of non-linearity,
as quantified by $E = \sigma(R = 0)/l_{\rm grid}$. Here, $\sigma(R = 0)$ is the
variance at the level of the grid and $l_{\rm grid}$ the grid spacing. $E$
is sensitive to both the degree of non-linearity reached and
the level of accuracy of the Zel'dovich displacements. The
best fit for $f_a$ is

$f_a = 0.22 + (\log E - 0.36) \times 0.11$. (A5)

We also demand that $0.22 < f_a < 0.26$. The best fit for $f_s$ is
0.06. (In all cases a change in the last significant digit gives
differences in the mass function appreciable at the 5 per cent
level.)
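The fitting relations (A4) and (A5) can be chained into one helper, sketched below. Two points are assumptions rather than statements from the text: the logarithm is taken to be base 10, and the bound on $f_a$ is applied by clipping to the interval from 0.22 to 0.26. The function name is hypothetical.

```python
import math


def accretion_parameters(E):
    """Return (f_a, f_ra) for a given non-linearity measure E.

    f_a follows (A5), assuming a base-10 logarithm, and is clipped to
    the range 0.22 to 0.26 (assumed reading of the bound in the text).
    f_ra then follows from the correlation (A4).
    """
    f_a = 0.22 + (math.log10(E) - 0.36) * 0.11
    f_a = min(max(f_a, 0.22), 0.26)
    f_ra = 0.40 - 3.5 * (f_a - 0.22)
    return f_a, f_ra


# At the lower bound f_a = 0.22, relation (A4) gives f_ra = 0.40.
f_a_low, f_ra_low = accretion_parameters(1.0)
```

Because of the clipping, $f_{ra}$ is confined to the range 0.26 to 0.40 under these assumptions.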

Unfortunately, these parameters are sensitive to the
algorithm used to generate initial conditions; in particular,
they depend on how the small scale power close to
the Nyquist frequency is quenched. For the ΛCDM models,
we used the initial conditions generator distributed with
HYDRA (Couchman et al. 1995), where power below the
Nyquist frequency (on a grid with unit grid spacing, taken to
be $k_e = 0.8\pi$) is quenched exponentially, $\propto \exp(-(k/k_e)^{16})$.
The parameters for PINOCCHIO apply for this type of initial
conditions generator. In contrast, the initial conditions
for the $360^3$ SCDM simulation were generated on a $180^3$
grid, without an additional cut-off of small scale power.
The corresponding PINOCCHIO parameters are $f_a = 0.19$,
$f_{ra} = 0.60$ and $f_s = 0.04$.
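To show how sharp the quoted exponential quenching is, the sketch below evaluates the damping factor on a grid with unit spacing, where the Nyquist frequency is $\pi$. The function name is hypothetical, and whether the factor multiplies the power spectrum or the mode amplitudes in the HYDRA generator is not stated here, so the code only evaluates the factor itself.

```python
import math


def quench(k, k_e=0.8 * math.pi):
    """Damping factor exp(-(k / k_e)**16) for wavenumber k.

    Assumes unit grid spacing, so the Nyquist frequency equals pi.
    """
    return math.exp(-((k / k_e) ** 16))


# The factor stays close to 1 well below k_e and is negligible at Nyquist.
at_half_k_e = quench(0.4 * math.pi)
at_k_e = quench(0.8 * math.pi)
at_nyquist = quench(math.pi)
```

The high exponent makes the filter behave almost like a step: modes at half of $k_e$ are essentially untouched, while modes at the Nyquist frequency are removed.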

It is possible there are other degeneracies amongst these
parameters. We have verified through extensive analysis that
the object-by-object agreement is rather insensitive to the
precise values: once the mass function fits well, the
object-by-object agreement is good too. We have tried many other
recipes for the parameters, but this one is adequate for
generating reliable halo catalogues for a wide variety of
cosmological models.

Figure A1. Error in the estimate of the Zel'dovich displacements
for particles that have just undergone orbit crossing, as a function
of the growing mode $b$. Continuous lines are the average displacement
of the collapsing particles; dashed lines are the error in the
estimate of these displacements, as computed by comparing with
the simulation results. Thick lines are obtained from the ΛCDM
simulation, thin lines from the ΛCDM128 simulation.



APPENDIX B: RELIABLE ESTIMATE OF THE 
ANGULAR MOMENTUM OF A DM HALO 

It was mentioned in Section 5 that a reliable estimate of 
the angular momentum of an N-body halo requires at least 
100 particles. As this matter is of interest to many N-body 
simulators, we give here details of the analysis we have per- 
formed. 

This matter can be addressed by using our ΛCDM and
ΛCDM128 simulations, recalling that ΛCDM128 is run on
the same initial conditions as ΛCDM, resampled on the
coarser grid. We consider the z = outputs of the two
simulations and match the halo catalogues in exactly the same
way as done in the object-by-object comparison of PINOCCHIO
and N-body catalogues. In practice, the $256^3$ linking
list is resampled to $128^3$ by nearest grid assignment, i.e.
simply by considering one particle in eight and skipping the others.
Notice that this resampling is used only to match halo pairs;
the halo properties are computed from the complete lists of
particles. In the following we will take the properties of
the $256^3$ groups as the bona fide estimate, and will interpret the
difference between $128^3$ and $256^3$ as the error on the
lower-resolution groups.
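The one-in-eight resampling described above can be sketched as follows. The storage layout is an assumption made for illustration: group IDs are held in a flat list indexed by the particle's position on the initial cubic grid, with index `(i * n + j) * n + k`. The function name is hypothetical.

```python
def subsample_one_in_eight(group_ids, n):
    """Resample a flat n^3 list of group IDs to (n/2)^3 entries.

    Keeps only particles whose grid indices i, j and k are all even,
    i.e. one particle in eight. Assumes index = (i * n + j) * n + k.
    """
    return [
        group_ids[(i * n + j) * n + k]
        for i in range(0, n, 2)
        for j in range(0, n, 2)
        for k in range(0, n, 2)
    ]


# Tiny example on a 4^3 grid, using the flat index itself as the "group ID".
coarse = subsample_one_in_eight(list(range(4 ** 3)), 4)
```

The coarse list is used only to pair up halos between the two catalogues; halo properties would still be computed from the full particle lists, as stated in the text.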



Figure B1a shows the fractions $f_{cl}$, $f_{split}$ and $f_{ov}$ for the
matching of the two catalogues, as a function of the mass
of the halo according to the ΛCDM128 simulation. In this
and subsequent panels the vertical line marks the groups of
100 particles ($128^3$). The matching of the two catalogues is
excellent for groups larger than 100 particles, but still
reasonable for groups as small as ~30 particles. Mass estimates
are fairly stable (figure B1b), with an error of 30-40 per cent
for the smallest groups, decreasing towards the high-mass end.
Conversely, the error on the spin estimate turns out to
be much larger. Figure B1c shows the fractional difference
between halo spins as a function of mass (the rms difference
is also shown), while figure B1d shows the alignment
angles of the spins (the rms of the mean is shown in this case,
as in figure Q). The rms difference is still in excess of a
factor of two for halos of 100 particles, and even larger for
smaller halos. Moreover, the spin directions
are very poorly correlated for halos with fewer than 100
particles. We conclude that the lower limit for a correct
order-of-magnitude estimate of the angular momentum of a halo
is 100 particles, while a more precise estimate will require
at least ten times more particles.

Figure B1. In all the panels the vertical line marks groups of 100 particles in the ΛCDM128 simulation. (a) Matching of the ΛCDM and
ΛCDM128 halo catalogues. Continuous line: $f_{cl}$; dashed line: $f_{split}$; dotted line: $f_{ov}$. (b) Correlation of masses for the cleanly assigned
objects. (c) Fractional difference of angular momenta for the cleanly assigned objects, as a function of mass. Error bars give the rms
difference in bins of mass. (d) Alignment angle between the angular momenta of cleanly assigned objects. Error bars give the rms of the
mean of the alignment angles in bins of mass.
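The two spin diagnostics used in this comparison can be sketched for a single matched halo pair as below. The function names are hypothetical, and the precise definition of the fractional difference is an assumption: here it is the difference of the moduli divided by the modulus of the higher-resolution estimate.

```python
import math


def alignment_angle_deg(L1, L2):
    """Angle in degrees between two angular momentum vectors."""
    dot = sum(a * b for a, b in zip(L1, L2))
    norm1 = math.sqrt(sum(a * a for a in L1))
    norm2 = math.sqrt(sum(a * a for a in L2))
    # Clamp to [-1, 1] to guard against rounding before acos.
    cos_angle = max(-1.0, min(1.0, dot / (norm1 * norm2)))
    return math.degrees(math.acos(cos_angle))


def fractional_difference(L_low, L_high):
    """(|L_low| - |L_high|) / |L_high|, an assumed definition."""
    mod_low = math.sqrt(sum(a * a for a in L_low))
    mod_high = math.sqrt(sum(a * a for a in L_high))
    return (mod_low - mod_high) / mod_high


# Made-up vectors: perpendicular spins, the coarse one 50 per cent larger.
angle = alignment_angle_deg((3.0, 0.0, 0.0), (0.0, 2.0, 0.0))
frac = fractional_difference((3.0, 0.0, 0.0), (0.0, 2.0, 0.0))
```

Binning these two quantities by halo mass over all cleanly assigned pairs would give curves of the kind described for panels (c) and (d).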






